
Discussion Board

  1. #16
    Registered User
    Join Date
    Jul 2012
    Location
    India
    Posts
    103

    Re: i want to extract text between div tag in j2me(java)

    Hi petrib,

    Is this correct for my requirement? I am getting a compile-time error. Can you suggest a fix?

    String regexfordesc="/(<div class=\"innercontenttxt\">)(.*?)(<\/div>)/ig";

    Quote Originally Posted by petrib View Post
    Modify the first regular expression I already gave you in http://www.developer.nokia.com/Commu...l=1#post900261

    (Moderators might just as well merge this thread with that one.)
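A note on the compile-time error, offered as a sketch rather than the poster's final solution: in a Java string literal `\/` is not a legal escape sequence, and the `/.../ig` wrapper is JavaScript regex-literal syntax. CLDC/MIDP also does not ship `java.util.regex`, so plain substring search is one workable alternative. The class and method names below (`DivExtractor`, `extractBetween`) are hypothetical and not from the thread.

```java
// Hypothetical helper; names are assumptions made for illustration.
final class DivExtractor {
    private DivExtractor() {}

    /**
     * Returns the text between the first startMarker and the next endMarker,
     * or null if either marker is missing. Uses only indexOf/substring,
     * which are available on CLDC.
     */
    static String extractBetween(String source, String startMarker, String endMarker) {
        int start = source.indexOf(startMarker);
        if (start == -1) {
            return null; // opening marker not found
        }
        start += startMarker.length();
        int end = source.indexOf(endMarker, start);
        if (end == -1) {
            return null; // closing marker not found
        }
        return source.substring(start, end);
    }
}
```

It could be called as `DivExtractor.extractBetween(html, "<div class=\"innercontenttxt\">", "</div>")`. Note that it stops at the first `</div>`, so a div nested inside the target would cut the result short.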

  2. #17
    Registered User
    Join Date
    Nov 2009
    Posts
    63

    Re: i want to extract text between div tag in j2me(java)

    Well, I don't understand your problem. You get the source code of the URL with the getSourceFromUrl() method, which you can find here: http://code.huypv.net/2010/12/j2me-g...ml-source.html

    Next, you search for the part you want (use your PC browser to check that it occurs only once in the source code). That means creating a substring from "begin of needed part" to "end of needed part".
    Then you need to remove all HTML tags from that substring. Here is the code I use for this:
    Code:
    // Remove HTML tags
    for (int i = 0; i < 50; i++) {
        int beginTag = 0;
        int endTag = 0;
        if (note.indexOf("<") != -1) {
            beginTag = note.indexOf("<");
            if (note.indexOf(">") != -1) {
                endTag = note.indexOf(">");

                if (beginTag == 0) {
                    note = note.substring(endTag + 1, note.length());
                } else {
                    note = note.substring(0, beginTag) + note.substring(endTag + 1, note.length());
                }
            }
        } else {
            i = 50; // no tags left: end the loop
        }
    }
    (This code handles up to 50 HTML tags in the source part. You can modify it to use "while (note.indexOf("<") != -1)" instead of "for (...)", in which case the outer if and else are no longer needed inside the while loop.)

    "note" is the substring you found before, and there you go.

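The while-based variant mentioned in the post above might look roughly like the sketch below. It is not the poster's code: the class name `TagStripper` and method name `stripTags` are hypothetical, and the sketch additionally searches for ">" only after the "<" it belongs to and stops on an unterminated tag, which the original snippet does not do.

```java
// Hypothetical helper class; the names are assumptions, not from the thread.
final class TagStripper {
    private TagStripper() {}

    /** Removes every "<...>" span from text using only indexOf/substring. */
    static String stripTags(String text) {
        StringBuffer out = new StringBuffer(); // StringBuffer is available on CLDC
        int pos = 0;
        while (true) {
            int begin = text.indexOf('<', pos);
            if (begin == -1) {
                break; // no more tags
            }
            int end = text.indexOf('>', begin + 1);
            if (end == -1) {
                break; // unterminated tag: keep the remainder unchanged
            }
            out.append(text.substring(pos, begin)); // text before the tag
            pos = end + 1;                          // continue after the tag
        }
        out.append(text.substring(pos));
        return out.toString();
    }
}
```

With this shape there is no fixed limit of 50 tags, and the string is copied once into a buffer instead of being rebuilt on every iteration.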
  3. #18
    Registered User
    Join Date
    Jul 2012
    Location
    India
    Posts
    103

    Re: i want to extract text between div tag in j2me(java)

    Hi schumi,

    Thanks, I got what I needed.

    Last edited by pavanragi; 2012-07-26 at 13:24.

  4. #19
    Registered User
    Join Date
    Jul 2012
    Location
    India
    Posts
    103

    Html Parsing in j2me

    I have an HTML string (http://pastebin.com/2VY4ZU5C) and want to extract the description value from it in J2ME. After extraction, my output should look like the text below. Can anyone help me?

    Output:


    President Pranab pay great tributes to Motilal Nehru on occasion of </span>150th birth anniversary. Pranab said institutions evolved by leaders like him should be strengthened instead of being destroyed. <span style="mso-spacerun:yes">&nbsp;</span>He listed his achievements like his role in evolving of Public Accounts Committee and protecting independence of Legislature from the influence of the Executive by establishing a separate cadre for the Central Legislative Assembly, now Parliament. Calling himself a student of history, he said Motilal's Swaraj Party acted as a disciplined assault force in the Legislative Assembly and he was credited with evolving the system of a Public Accounts Committee which is now one of the most effective watchdogs over executive in matters of money and finance. Mukherjee also received the first set of coins and postal stamps released at the function to commemorate the event.

  5. #20
    Nokia Developer Moderator
    Join Date
    Feb 2006
    Location
    Oslo, Norway
    Posts
    28,748

    Re: Html Parsing in j2me

    Have you applied the
    Code:
    while (note.indexOf("<") != -1)
    suggestion?

  6. #21
    Registered User
    Join Date
    Jul 2012
    Location
    India
    Posts
    103

    Re: Html Parsing in j2me

    Hi wizard and schumi,
    Thanks for the replies.
    Yes, I have applied the while logic too. After parsing the URL http://www.teluguone.com/news/conten...-20-17680.html and removing the HTML tags, I get this description as output: http://pastebin.com/TXFyvhZE

    Here is the logic I used to get the above description (output):

    String readUrl = ReadUrl.readUrl(URL);
    int divIndex = readUrl.indexOf("<div class=\"innercontenttxt\">");
    divIndex = readUrl.indexOf(">", divIndex);

    int endDivIndex = readUrl.indexOf("</div>", divIndex);
    content = readUrl.substring(divIndex + 1, endDivIndex);

    //System.out.println("Content" + content);
    while (content.indexOf("<") != -1) {
        int beginTag = content.indexOf("<");
        int endTag = content.indexOf(">");
        if (beginTag == 0) {
            content = content.substring(endTag + 1, content.length());
        } else {
            content = content.substring(0, beginTag) + content.substring(endTag + 1, content.length());
        }
    }

    String description = replace(content, "&quot;", "\"");
    description = replace(description, "&nbsp;", "");
    description = replace(description, "&rsquo;", "'");
    description = replace(description, "&lsquo;", "'");
    description = replace(description, "&ldquo;", "\"");
    description = replace(description, "&rdquo;", "\"");
    description = replace(description, "&ndash;", "-");

    description = replace(description, "&amp;", "&");
    System.out.println("Out" + description);
    Last edited by pavanragi; 2012-10-05 at 08:02.
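The `replace(...)` helper called in the post above is not shown in the thread. CLDC's `String` only offers `replace(char, char)`, so a string-for-string helper has to be written by hand. The following is a minimal sketch of what such a helper could look like; the class name `TextUtil` and the exact signature are assumptions.

```java
// Hypothetical stand-in for the poster's unseen replace(...) helper.
final class TextUtil {
    private TextUtil() {}

    /** Replaces every occurrence of target in text with replacement. */
    static String replace(String text, String target, String replacement) {
        if (target.length() == 0) {
            return text; // nothing sensible to search for
        }
        StringBuffer out = new StringBuffer();
        int pos = 0;
        int hit;
        while ((hit = text.indexOf(target, pos)) != -1) {
            out.append(text.substring(pos, hit)); // text before the match
            out.append(replacement);
            pos = hit + target.length();          // skip past the match
        }
        out.append(text.substring(pos));          // trailing text
        return out.toString();
    }
}
```

Decoding `&amp;` last, as the post does, is a sensible order: decoding it first would turn text such as `&amp;nbsp;` into `&nbsp;`, which a later replacement would then remove by mistake.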

  7. #22
    Nokia Developer Moderator
    Join Date
    Feb 2006
    Location
    Oslo, Norway
    Posts
    28,748

    Re: Html Parsing in j2me

    Yes, this output is what this tag-removal attempt is expected to produce.
    The problem you see comes from the XML comment in the document, the <!-- ... -->. Since this simple mechanism only looks for < and > pairs, the extra < in the <!-- throws off the pairing of everything in between, which is why you get those mso-... style tags back as 'content'.
    A step forward could be removing the XML comments first, then the tags.

    You can simply re-use the code: replace < with <!-- and > with -->, and adjust the lengths.
    Code:
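    // Pass 1: remove comments
    while (content.indexOf("<!--") != -1)
    {
        int beginTag = content.indexOf("<!--");
        int endTag = content.indexOf("-->", beginTag); // search after the opening marker

        if (endTag == -1)
        {
            break; // unterminated comment: avoid an endless loop
        }

        if (beginTag == 0)
        {
            content = content.substring(endTag + 3, content.length());
        }
        else
        {
            content = content.substring(0, beginTag) + content.substring(endTag + 3, content.length());
        }
    }

    // Pass 2: remove the remaining tags
    while (content.indexOf("<") != -1)
    {
        int beginTag = content.indexOf("<");
        int endTag = content.indexOf(">", beginTag); // search after the opening bracket

        if (endTag == -1)
        {
            break; // unterminated tag: avoid an endless loop
        }

        if (beginTag == 0)
        {
            content = content.substring(endTag + 1, content.length());
        }
        else
        {
            content = content.substring(0, beginTag) + content.substring(endTag + 1, content.length());
        }
    }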
    Because of the identical structure of the two loops, you can of course extract the whole thing into a method if you like.
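One way the extraction into a method could look is sketched below. The names `MarkupCleaner`, `removeDelimited`, and `clean` are hypothetical, and the sketch assumes comments should be removed before tags, as suggested above.

```java
// Hypothetical helper; names are assumptions made for illustration.
final class MarkupCleaner {
    private MarkupCleaner() {}

    /** Removes every span that starts with open and ends with close. */
    static String removeDelimited(String text, String open, String close) {
        if (open.length() == 0 || close.length() == 0) {
            return text; // empty delimiters would never advance
        }
        StringBuffer out = new StringBuffer();
        int pos = 0;
        while (true) {
            int begin = text.indexOf(open, pos);
            if (begin == -1) {
                break; // no further opening delimiter
            }
            int end = text.indexOf(close, begin + open.length());
            if (end == -1) {
                break; // unterminated span: keep the rest as is
            }
            out.append(text.substring(pos, begin));
            pos = end + close.length(); // the "adjusted length" per delimiter
        }
        out.append(text.substring(pos));
        return out.toString();
    }

    /** Strips comments first, then the remaining tags. */
    static String clean(String html) {
        String withoutComments = removeDelimited(html, "<!--", "-->");
        return removeDelimited(withoutComments, "<", ">");
    }
}
```

Calling `MarkupCleaner.clean(content)` would then replace both loops; the delimiter lengths (3 for `-->`, 1 for `>`) come from `close.length()` instead of being hard-coded.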

  8. #23
    Registered User
    Join Date
    Jul 2012
    Location
    India
    Posts
    103

    Re: Html Parsing in j2me

    Hi Wizard,
    Thanks for your reply.
