Wordreference Translator using Qt C++

From Nokia Developer Wiki

This article explains how to use web scraping to build a translator that takes its data from WordReference.com.

Introduction

WordReference.com is an online translation dictionary for language pairs such as English-French, English-Italian, English-Spanish, French-Spanish and Spanish-Portuguese. Although the site provides an API, this article shows how to get the translations by web scraping with Qt.

HTML inspector: a helper for web scraping

Before you start scraping, you need to know where the content you are interested in is located. Every element you see in an HTML page sits at a particular place in the document. To find that place you can use a WebKit-based browser such as Chrome or Chromium, which makes the operation pretty simple: right-click the element you are interested in and inspect it. This pops up a window that shows the exact location of the element in the HTML.

For instance, the translation on a WordReference.com page is inside an HTML "ol" element.
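To make "the translation is inside the ol element" concrete, here is a minimal, dependency-free sketch in standard C++ that pulls the text out of the first element with a given tag name. It is only an illustration of the idea, not the approach used in this article: the function name firstElementText is hypothetical, and the naive string search assumes lowercase tag names and no nested elements with the same tag. Real pages need a proper HTML engine such as WebKit.

```cpp
#include <cctype>
#include <string>

// Returns the text found between the first <tag ...> and the next </tag>,
// with inner markup removed. Each inner tag boundary becomes a line break.
// Returns an empty string when no such element exists.
// Naive sketch: lowercase tags only, no same-tag nesting, no entities.
std::string firstElementText(const std::string &html, const std::string &tag)
{
    const std::string open = "<" + tag;
    const std::string close = "</" + tag + ">";
    const std::string::size_type npos = std::string::npos;

    // Find "<tag" followed by '>' or whitespace, so "<ol" does not match "<olive".
    std::string::size_type start = html.find(open);
    while (start != npos) {
        const std::string::size_type after = start + open.size();
        if (after < html.size() &&
            (html[after] == '>' ||
             std::isspace(static_cast<unsigned char>(html[after])))) {
            break;
        }
        start = html.find(open, after);
    }
    if (start == npos) return "";

    std::string::size_type contentStart = html.find('>', start);
    if (contentStart == npos) return "";
    ++contentStart;

    const std::string::size_type end = html.find(close, contentStart);
    if (end == npos) return "";

    // Copy the characters that are outside of tags.
    std::string text;
    bool inTag = false;
    for (std::string::size_type i = contentStart; i < end; ++i) {
        const char c = html[i];
        if (c == '<') {
            inTag = true;
            if (!text.empty() && text[text.size() - 1] != '\n') text += '\n';
        } else if (c == '>') {
            inTag = false;
        } else if (!inTag) {
            text += c;
        }
    }
    // Drop the line break left by the last inner closing tag.
    if (!text.empty() && text[text.size() - 1] == '\n') text.erase(text.size() - 1);
    return text;
}
```

Calling firstElementText(page, "ol") on a page whose first ordered list holds the items "casa" and "abitazione" gives those two words on separate lines. Only the first "ol" element is considered, and later ones are ignored.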

Let's code!

So we need a class which is able to:

  • Fetch the HTML code
  • Extract the data from the HTML

A convenient way to do this in Qt is to use WebKit, because real-world HTML is usually not well-formed XML and so cannot be fed reliably to an XML parser. To skip everything that is not interesting, we load the HTML into a QWebPage and pick out the data we want with a simple call to QWebFrame::findFirstElement, as shown in the code below.

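#include <QObject>
#include <QDebug>
#include <QUrl>
#include <QNetworkAccessManager>
#include <QNetworkRequest>
#include <QNetworkReply>
#include <QWebPage>
#include <QWebFrame>
#include <QWebElement>

// Fetches a WordReference.com page and extracts the translation from it.
// QWebPage needs a QApplication instance to exist.
class wrMgr : public QObject
{
    Q_OBJECT
public:
    explicit wrMgr(QObject *parent = 0) :
        QObject(parent)
    {
        mManager = new QNetworkAccessManager(this);
        connect(mManager, SIGNAL(finished(QNetworkReply*)),
                this, SLOT(replyFinished(QNetworkReply*)));
    }

    // dict is the dictionary code used in the URL path,
    // e.g. "enit" for English-Italian.
    void lookUp(const QString& word, const QString& dict = "enit")
    {
        QUrl url("http://www.wordreference.com/" + dict + "/" + word);
        QNetworkRequest r(url);
        mManager->get(r);
    }

private slots:
    void replyFinished(QNetworkReply *reply)
    {
        // The reply must not be deleted directly inside this slot.
        reply->deleteLater();

        if (reply->error() != QNetworkReply::NoError) {
            qWarning() << "ERROR:" << reply->errorString();
            return;
        }

        // Scrape the HTML (assumes the page is UTF-8 encoded)
        QString html = QString::fromUtf8(reply->readAll());
        QWebPage page;
        QWebFrame *frame = page.mainFrame();
        frame->setHtml(html);

        // The translation lives in the first "ol" element of the page
        QWebElement translation = frame->findFirstElement("ol");

        emit wordTranslated(translation.toPlainText());
    }

signals:
    void wordTranslated(const QString& translation);

private:
    QNetworkAccessManager *mManager;
};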
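The lookUp method above builds the request URL by appending the word directly to the dictionary path. Words that contain spaces or accented characters should be percent-encoded first. The following is a minimal sketch of that step in standard C++, so it can be tried without Qt. The helper names percentEncode and buildLookupUrl are hypothetical, and the dictionary code "enit" is simply the one that appears in the URL above. In a Qt application, QUrl::toPercentEncoding does the same job.

```cpp
#include <string>

// Percent-encodes every byte that is not an RFC 3986 "unreserved" character
// (letters, digits, '-', '.', '_', '~').
std::string percentEncode(const std::string &input)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Builds the lookup URL for a dictionary code (e.g. "enit") and a word.
// The word is expected to be UTF-8 encoded.
std::string buildLookupUrl(const std::string &dictCode, const std::string &word)
{
    return "http://www.wordreference.com/" + dictCode + "/" + percentEncode(word);
}
```

For example, buildLookupUrl("enit", "ice cream") yields a URL ending in "/enit/ice%20cream". Encoding only the word, rather than the whole URL, keeps the slashes of the path intact.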

Conclusion

Web scraping is fun and easy to implement in Qt, as shown here. Once again the WebKit engine has shown its flexibility and power, which let us do the job in just a few lines of code. I wrote this WordReference translator class only to show how to implement web scraping with WebKit and Qt, but the same technique can be used in many other applications.

Article Metadata
Compatibility
Platform(s): Symbian
Device(s): All Qt platforms
Article
Keywords: web scraping, webkit, Qt
Created: gnuton (29 Jun 2011)
Last edited: hamishwillee (11 Oct 2012)
This page was last modified on 11 October 2012, at 01:13.