The main task of the gwebcmd utility is getting web data in the CSV format.

gwebcmd loads web data, parses it, and saves it to a CSV file.

gwebcmd can parse the following web data formats:

  • XML
  • JSON
  • HTML
  • CSV

gwebcmd can also retrieve protected web data and supports the following authorization methods:

  1. Windows
  2. Basic
  3. Forms
  4. OAuth 1.0
  5. OAuth 2.0

The resulting CSV data can then be imported into other information systems.

If you need the data in Microsoft Excel, you may use the SaveToDB add-in for Microsoft Excel, which loads web data into Microsoft Excel.

Getting Web Data in CSV

The webtocsv mode is used to get the data and save it to a CSV file.

The command line format is

gwebcmd webtocsv  <URL> [<output file name>] [<Options>]

Examples of getting data from XML data sources:

gwebcmd webtocsv https://api.datamarket.azure.com/WorldBank/WorldDevelopmentIndicators/v1/GetCountries?LanguageCode=%27en%27 GetCountries.csv
gwebcmd webtocsv http://api.worldbank.org/incomeLevels/LIC/countries countries.csv
gwebcmd webtocsv https://api.linkedin.com/v1/people/~ linkedin_info.csv
gwebcmd webtocsv https://docs.google.com/feeds/default/private/full?v=3 google_docs.csv
gwebcmd webtocsv "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20%28%22AAPL,GOOG,YHOO%22%29&diagnostics=false&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys" quotes.csv
gwebcmd webtocsv http://social.yahooapis.com/v1/user/me/contacts yahoo_contacts.csv

Examples of getting data from JSON data sources: 

gwebcmd webtocsv https://apis.live.net/v5.0/me/contacts live_contacts.csv
gwebcmd webtocsv https://graph.facebook.com/me/friends facebook_friends.csv
gwebcmd webtocsv http://www.google.com/m8/feeds/contacts/default/base?alt=json google_contacts.csv
gwebcmd webtocsv http://api.linkedin.com/v1/people/~/connections:(id,first-name,last-name)?format=json linkedin_connections.csv
gwebcmd webtocsv https://api.twitter.com/1.1/statuses/user_timeline.json twitter_user_timeline.csv

Examples of getting data from CSV web pages:

gwebcmd webtocsv "http://www.google.com/finance/historical?q=AAPL&output=csv" google_csv_aapl.csv
gwebcmd webtocsv "http://ichart.finance.yahoo.com/table.csv?s=AAPL&ignore=.csv" yahoo_csv_aapl.csv

Examples of getting data from HTML web pages:

gwebcmd webtocsv http://finance.yahoo.com/q/op?s=AAPL yahoo_options_aapl.csv
gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices yahoo_htm_aapl.csv
gwebcmd webtocsv http://www.google.com/finance/historical?q=AAPL google_htm_aapl.csv 

Customizing CSV Data

The following options are used to customize output CSV data:

/datetimeformat=<format>
/separator=<separator>|Tab
/add=<header=value>[<separator>...]

These options define the output format for datetime values, set the column separator, and add columns with constant values to the CSV output.

For example:

gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices yahoo_htm_aapl.csv /datetimeformat=yyyy-MM-dd /separator=; /add=Symb=AAPL

The first rows of the result file:

_RowNum;Symb;Date;Open;High;Low;Close;Volume;Adj Close*
0;"AAPL";2014-07-07;420.86;422.98;417.45;420.8;8590100;420.8
1;"AAPL";2013-07-02;409.96;421.63;409.47;418.49;16780900;418.49
2;"AAPL";2013-07-01;402.69;412.27;401.22;409.22;13966200;409.22
3;"AAPL";2013-06-28;391.36;400.27;388.87;396.53;20661300;396.53
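
Any CSV reader that accepts a custom delimiter can consume this output. The following is a minimal Python sketch, assuming the file was produced with /separator=; and /datetimeformat=yyyy-MM-dd as in the example. The sample text is copied from the rows shown, and the helper name parse_rows is illustrative only, not part of gwebcmd.

```python
import csv
import io
from datetime import datetime

# Sample copied from the result rows above; a real script would open the CSV file instead.
SAMPLE = """_RowNum;Symb;Date;Open;High;Low;Close;Volume;Adj Close*
0;"AAPL";2014-07-07;420.86;422.98;417.45;420.8;8590100;420.8
1;"AAPL";2013-07-02;409.96;421.63;409.47;418.49;16780900;418.49
"""

def parse_rows(text, separator=";"):
    """Parse gwebcmd-style CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    rows = []
    for record in reader:
        rows.append({
            "symbol": record["Symb"],
            # yyyy-MM-dd in /datetimeformat corresponds to %Y-%m-%d here
            "date": datetime.strptime(record["Date"], "%Y-%m-%d").date(),
            "close": float(record["Close"]),
            "volume": int(record["Volume"]),
        })
    return rows

rows = parse_rows(SAMPLE)
print(rows[0]["symbol"], rows[0]["date"], rows[0]["close"])
```

If a different /separator or /datetimeformat value is used, the separator argument and the strptime pattern have to be changed to match.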

Getting Data in a Loop

When you send multiple requests to a website, add a delay between the requests.

Websites usually state this requirement in their Terms of Service.

Use the sleep mode to add a delay, in milliseconds, in command files.

For example:

gwebcmd sleep 500

Here is an example of a simple command file to get the data for a list of tickers:

@echo off
rem Full path of the gwebcmd executable
set gwebcmd="C:\Program Files (x86)\Gartle\gwebcmd\gwebcmd.exe"
rem Read tickers.txt line by line; %%i holds the current ticker
for /F %%i in (tickers.txt) do (
    echo %%i
    %gwebcmd% webtocsv http://finance.yahoo.com/q/hp?s=%%i+Historical+Prices %%i.csv /datetimeformat=yyyy-MM-dd /add=Symb=%%i
    %gwebcmd% sleep 500
)

The gwebcmd variable specifies the full path of the gwebcmd executable.

Alternatively, add the directory to the PATH variable and call gwebcmd.exe directly.

The tickers.txt file contains one ticker per line. For example:

AAPL
FB
GOOG
LNKD
MSFT
YHOO
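
The same loop can be driven from a script instead of a command file. The following Python sketch rests on assumptions: the executable path is the one from the command file above, and the function names (read_tickers, build_command, download_all) are hypothetical helpers, not part of gwebcmd. Only the command line itself comes from this documentation.

```python
import subprocess
import time

# Path taken from the command file above; adjust to the actual installation.
GWEBCMD = r"C:\Program Files (x86)\Gartle\gwebcmd\gwebcmd.exe"

def read_tickers(lines):
    """Return non-empty, stripped ticker symbols from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]

def build_command(ticker):
    """Build the same webtocsv command line that the command file runs."""
    url = f"http://finance.yahoo.com/q/hp?s={ticker}+Historical+Prices"
    return [GWEBCMD, "webtocsv", url, f"{ticker}.csv",
            "/datetimeformat=yyyy-MM-dd", f"/add=Symb={ticker}"]

def download_all(tickers, delay_seconds=0.5):
    """Run gwebcmd for each ticker, pausing between requests."""
    results = {}
    for ticker in tickers:
        completed = subprocess.run(build_command(ticker))
        results[ticker] = completed.returncode
        time.sleep(delay_seconds)  # same role as "gwebcmd sleep 500"
    return results

# Demonstration without running gwebcmd: parse a ticker list and show one command.
tickers = read_tickers(["AAPL\n", "FB\n", "\n", "GOOG\n"])
print(build_command(tickers[0])[1:])
```

On a machine with gwebcmd installed, download_all(read_tickers(open("tickers.txt"))) would perform the downloads. Passing the arguments as a list bypasses the command interpreter, so characters such as & in a URL need no extra quoting.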

Getting Web Text

The webtotext mode is used to get the text of a web page and save it to a file.

The command line format is

gwebcmd webtotext <URL> [<output file name>]

For example:

gwebcmd webtotext http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.htm

Parsing Text to CSV

The texttocsv mode is used to parse the text of a saved web page and save the CSV data to a file.

The command line format is

gwebcmd texttocsv <input file name> [<output file name>] [<Options>]

For example:

gwebcmd texttocsv aapl.htm aapl.csv

The options allow customizing CSV output and are the same as described above.

Connecting Web Data

gwebcmd can load protected web data.

It supports the following authorization methods:

  1. Windows
  2. Basic
  3. Forms
  4. OAuth 1.0
  5. OAuth 2.0

gwebcmd shows a simple dialog box to enter a user name and a password for the first three methods.

The HTTP Query Wizard is used to provide credentials for OAuth protected pages.

User authorization data is saved in the UserAuthData.txt file and reused for subsequent requests.

HTTP Query Wizard

The HTTP Query Wizard is used to provide credentials for OAuth protected pages.

The wizard shows the Connecting Web Data dialog box with the following fields:

URL
  The URL specified in the command line.

Service URL
  A service URL is the root URL of a protected website area or of a web service.
  User authorization data is stored per service URL, so multiple URLs can share the authorization data of their service URL.
  gwebcmd tries to discover the actual service URL of the specified URL.
  You may add known service URLs to the ServiceUrls.txt file.

OAuth Provider
  The OAuth provider used to authorize users to connect to protected web data.

Scope
  The scope of the requested permissions.
  Scopes differ between OAuth providers.
  gwebcmd tries to discover the appropriate scope.
  You may leave this field blank and test the connection.
  If the connection is unsuccessful, read the OAuth provider documentation.

Consumer Key / Client ID
  The Consumer Key or Client ID of a registered application.

Consumer Secret / Client Secret
  The Consumer Secret or Client Secret of a registered application.

Redirect URL
  The redirect URL of a registered application.

OAuth Provider Registered Applications

The OAuth authorization model requires registering an application with the provider before protected data can be accessed.

Registering an OAuth 1.0 application yields a Consumer Key, a Consumer Secret, and a Redirect URL.

Registering an OAuth 2.0 application yields a Client ID, a Client Secret, and a Redirect URL.

These values are entered in the wizard described above.

Use the following pages to register applications:

OAuth Provider             Application Registration URL
Facebook                   https://developers.facebook.com/apps
LinkedIn                   https://www.linkedin.com/secure/developer
Twitter                    https://dev.twitter.com/apps
Yahoo                      https://developer.apps.yahoo.com/projects
Google                     https://code.google.com/apis/console/
Windows Live               https://account.live.com/developers/applications/index
Windows Azure Marketplace  https://datamarket.azure.com/developer/applications

Command Line

Command line formats:

gwebcmd help
gwebcmd webtocsv  <URL> [<output file name>] [<Options>]
gwebcmd webtotext <URL> [<output file name>]
gwebcmd texttocsv <input file name> [<output file name>] [<Options>]
gwebcmd sleep     <milliseconds>
where <Options> are:
/accept=<accept>
/add=<header=value>[<separator>...]
/addrownum
/append
/datetimeformat=<format>
/inputcodepage=<codepage>
/outputcodepage=<codepage>
/relogon
/separator=<separator>|Tab

Mode Help

The utility shows a short description of command line options.

Mode WebToCSV

The utility loads the text from the specified URL, parses the text, and outputs the CSV data to the console or to the specified file.

If web data in XML or JSON format contains a next data URL, the utility follows it and keeps loading until no further data remains.

Example:

gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.csv
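
The "next data URL" behavior of WebToCSV can be pictured as a loop that keeps requesting pages until a page no longer links to a successor. The Python sketch below only illustrates that idea; it is not gwebcmd's implementation. The key names "items" and "next", the function load_all_pages, and the example.com URLs are all hypothetical, and the fetch function is passed in so the example runs without network access.

```python
def load_all_pages(fetch, first_url, max_pages=1000):
    """Follow 'next' links starting at first_url and collect every item.

    fetch is any callable that takes a URL and returns a parsed page (a dict).
    The 'items' and 'next' keys are hypothetical placeholders.
    """
    items = []
    url = first_url
    seen = set()
    # Stop on a missing link, a repeated URL, or after max_pages requests.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        items.extend(page.get("items", []))
        url = page.get("next")  # None or a missing key ends the loop
    return items

# In-memory stand-in for a paged web service, used only for illustration.
FAKE_PAGES = {
    "http://example.com/data?page=1": {"items": [1, 2], "next": "http://example.com/data?page=2"},
    "http://example.com/data?page=2": {"items": [3], "next": None},
}

all_items = load_all_pages(FAKE_PAGES.__getitem__, "http://example.com/data?page=1")
print(all_items)
```

The seen set and the max_pages limit guard against services whose pages link back to each other.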

Mode WebToText

The utility loads the text from the specified URL and outputs it to the console or to the specified file.

The utility loads only the text at the specified URL and ignores any next data URL.

As a result, running WebToText followed by TextToCSV is not always equivalent to running WebToCSV.

Example:

gwebcmd webtotext http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.htm

Mode TextToCSV

The utility parses the specified file and outputs the CSV data to the console or to the specified file.

Example:

gwebcmd texttocsv aapl.htm aapl.csv

Mode Sleep

The utility waits for the specified number of milliseconds.

This mode is used to add a delay between requests.

Example:

gwebcmd sleep 500

Option Accept

This option specifies the Accept header of the web server request.

Specify this option if the default value is not suitable.

Example:

/accept=application/json;odata=verbose

Option Add

This option is used to specify additional data in the CSV output.

Example:

gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.csv /add=File=aapl.csv;Symb=AAPL

In this example, two columns are added: File and Symb, with the constant values aapl.csv and AAPL respectively.
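
When the constant columns come from a script, the /add argument can be assembled programmatically. The Python sketch below assumes that header=value pairs are joined with a semicolon, as in the example above; whether a different /separator value changes this is not stated here, so the pair separator is left as a parameter. The function name build_add_option is hypothetical.

```python
def build_add_option(columns, pair_separator=";"):
    """Format a dict of constant columns as a single /add=... argument."""
    for header, value in columns.items():
        # Reject characters that would make the argument ambiguous.
        if "=" in header or pair_separator in header or pair_separator in str(value):
            raise ValueError(f"cannot encode column {header!r}")
    pairs = [f"{header}={value}" for header, value in columns.items()]
    return "/add=" + pair_separator.join(pairs)

option = build_add_option({"File": "aapl.csv", "Symb": "AAPL"})
print(option)  # /add=File=aapl.csv;Symb=AAPL
```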

Option AddRowNum

If the option is specified, a first column with row numbers is added to the output.

Option Append

If the option is specified, the data is appended to the output file.

Option DateTimeFormat

This option is used to specify the format for datetime values in the CSV output.

See http://msdn.microsoft.com/en-us/library/zdtaw1bw(v=vs.100).aspx

Example:

gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.csv /datetimeformat=yyyy-MM-dd

Use quotes to specify formats with spaces. For example:

"/datetimeformat=yyyy-MM-dd hh:mm:ss"

Option InputCodePage

This option allows specifying the input file code page.

Example:

/inputcodepage=65001

Option OutputCodePage

This option allows specifying the output file code page.

Example:

/outputcodepage=1250
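
A program that reads the output file has to decode it with the matching encoding. The Python sketch below maps the two code page numbers used in the examples to Python codec names. It assumes the file is written in exactly the requested code page; the names CODEPAGE_TO_CODEC and decode_output are hypothetical, and the sample bytes are created in the script rather than produced by gwebcmd.

```python
# Windows code page numbers mapped to Python codec names.
# Only the two values used in the examples are listed.
CODEPAGE_TO_CODEC = {
    65001: "utf-8",
    1250: "cp1250",
}

def decode_output(data, codepage):
    """Decode bytes written with the given /outputcodepage value."""
    # For unlisted code pages, guess the "cpNNNN" codec name; this guess
    # does not exist for every code page and then raises LookupError.
    codec = CODEPAGE_TO_CODEC.get(codepage, f"cp{codepage}")
    return data.decode(codec)

# Simulate a line written in code page 1250 (Central European).
raw = "Symb;Name\nCEZ;ČEZ\n".encode("cp1250")
text = decode_output(raw, 1250)
print(text.splitlines()[1] == "CEZ;ČEZ")
```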

Option Relogon

This option launches the Connecting Web Data dialog box even if the URL has already been authorized.

Option Separator

This option is used to specify the CSV separator.

The default separator is a semicolon.

Use the Tab value to specify a tab character.

Example:

gwebcmd webtocsv http://finance.yahoo.com/q/hp?s=AAPL+Historical+Prices aapl.csv /separator=,

Exit Codes

Exit Code  Description
0          Success
1          Incomplete command line parameters
2          Wrong command line parameters
3          Exception
>200       HTTP status code
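
A calling script can translate the exit code into a readable message. The Python sketch below follows the exit code table above. The function name describe_exit_code is hypothetical, and the reason phrases for HTTP codes come from Python's standard http module, not from gwebcmd.

```python
from http import HTTPStatus

# Descriptions follow the exit code table above.
EXIT_CODES = {
    0: "Success",
    1: "Incomplete command line parameters",
    2: "Wrong command line parameters",
    3: "Exception",
}

def describe_exit_code(code):
    """Return a human-readable description of a gwebcmd exit code."""
    if code in EXIT_CODES:
        return EXIT_CODES[code]
    if code > 200:
        # Codes above 200 carry the HTTP status of the request.
        try:
            return f"HTTP {code} {HTTPStatus(code).phrase}"
        except ValueError:
            return f"HTTP {code}"
    return f"Unknown exit code {code}"

print(describe_exit_code(0))
print(describe_exit_code(404))
```

The value passed in would typically be the returncode of a subprocess call that ran gwebcmd.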

Data Files

gwebcmd uses the following files for storing application data:

  • ServiceUrls.txt
  • AppOAuthProviders.txt
  • UserOAuthProviders.txt
  • UserAuthData.txt

ServiceUrls.txt

The file contains known service URLs.

You may add new URLs if the utility does not recognize them in the Connecting Web Data dialog box.

AppOAuthProviders.txt

The file contains known OAuth providers.

UserOAuthProviders.txt

The file contains data of OAuth provider registered applications including ClientId, ClientSecret, and RedirectURL fields.

The file is stored in the current directory. So, you may copy it to other directories to use the configured OAuth providers.

The sensitive information is encrypted using Windows encryption features.

UserAuthData.txt

The file contains the user's authorization data.

The file is stored in the current directory, so you may copy it to other directories to reuse the existing authorization data.

The file can be edited in any text editor to copy or clear authorization data.

The sensitive information is encrypted using Windows encryption features.