  1. #1
    Registered User
    Join Date
    Jan 2012
    Posts
    33

    QXmlStreamReader to parse HTML String

    Hello,

    I'm trying to parse an HTML page using QXmlStreamReader, but I'm stuck.
    I use this method to parse the data:

    Code:
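    void Person::findIt(QNetworkReply *repToRead)
    {
        QByteArray bytes = repToRead->readAll();  // raw reply body
        QString string(bytes);                    // reply body as text

        QXmlStreamReader xml(string);

        // atEnd() also becomes true after a parse error;
        // xml.errorString() then reports the cause.
        while (!xml.atEnd()) {
            // readNext() returns a TokenType, not a bool, so call it
            // unconditionally and inspect the token afterwards.
            xml.readNext();

            qDebug() << xml.name();  // empty for non-element tokens

            if (xml.isStartElement() && xml.name() == QLatin1String("head")) {
                // Skip everything up to the matching </head>.
                xml.skipCurrentElement();
            } else if (xml.isEndElement()) {
                qDebug() << "endElement";
            }
        }
    }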
    I would like to skip the <head> element because of its META tags and go straight to the body.

    I'm getting this in the console:
    ""
    ""
    "html"
    ""
    "head"

    and nothing else. How can I bypass the header section?
    Is there another way to parse HTML strings in Qt?

  2. #2
    Registered User
    Join Date
    Jan 2012
    Posts
    33

    Re: QXmlStreamReader to parse HTML String

    Solved using:


    Code:
        QByteArray bytes = repToRead->readAll();  // raw reply body

        // Load the reply into a WebKit frame so it is parsed as HTML.
        QWebPage page;
        QWebFrame *frame = page.mainFrame();
        frame->setContent(bytes);

        // Select the wanted elements with a CSS selector.
        QWebElement document = frame->documentElement();
        QWebElementCollection elements = document.findAll("div.className");

        foreach (QWebElement e, elements) {
            qDebug() << "e element" << e.toPlainText();
            m_numberChecked = e.toPlainText();
            emit numberChanged();
        }
    Last edited by francesco_it; 2012-01-25 at 09:00.

  3. #3
    Super Contributor
    Join Date
    Mar 2009
    Posts
    1,024

    Re: QXmlStreamReader to parse HTML String

    Hi Francesco,
    This problem has been discussed in some other threads here.
    The solution you found is actually good, and it's the one I prefer, but it makes use of WebKit.

    An XML parser works only on well-formed documents. HTML headers can contain JS scripts or CSS, which make the page non-compliant as XML.
    If you want to check where the problem is, you can use http://www.w3schools.com/xml/xml_validator.asp.
    QString::remove(QRegExp("<head>.*</head>")) is a way to remove the head element completely before parsing.
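
    The head-stripping step can be sketched on its own. The version below is only an illustration and deliberately avoids Qt so that it compiles standalone: it uses std::regex instead of QRegExp, and the helper name stripHead is hypothetical. Unlike the plain "<head>.*</head>" pattern, it assumes the opening tag may carry attributes or be written in upper case, and it removes only the first head element.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `html` with the first
// <head>...</head> element removed. Matching is case-insensitive and
// tolerates attributes on the opening tag.
std::string stripHead(const std::string &html)
{
    // \b keeps "<header>" from matching; [\s\S]*? matches any characters,
    // including newlines, and stops at the first closing </head>.
    static const std::regex headRe("<head\\b[^>]*>[\\s\\S]*?</head\\s*>",
                                   std::regex::icase);

    // format_first_only: replace only the first match.
    return std::regex_replace(html, headRe, "",
                              std::regex_constants::format_first_only);
}
```

    The returned string could then be handed to the XML reader. This helps only if the rest of the page is well-formed XML; when the body contains unclosed tags as well, the WebKit approach from the previous post is probably the safer choice.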

  4. #4
    Registered User
    Join Date
    Jan 2012
    Posts
    33

    Re: QXmlStreamReader to parse HTML String

    Thank you for your answer. I'm new to Qt and Symbian, and I would like to ask whether WebKit is supported on Symbian^1 or Symbian^3.

  5. #5
    Super Contributor
    Join Date
    Mar 2009
    Posts
    1,024

    Re: QXmlStreamReader to parse HTML String

    Ciao Francesco,
    WebKit is supported by Qt for Symbian. If you have more questions about Qt on Symbian, feel free to ask in the Qt Symbian section.
