Anarchopedia:Bots
Latest revision as of 09:52, 16 November 2014
See in French : Bot
Bots can automate tasks and perform them much faster than humans. If you have a simple task that must be performed many times (for example, adding a template to every page in a category of 1000 pages), it is better suited to a bot than to a human.
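The per-page step of the template example above can be sketched as a small pure function. This is only an illustration: the function name `add_template` and the template name `Stub` are hypothetical, and a real bot would still need a framework such as Pywikipediabot to load and save each page.

```python
def add_template(wikitext, template):
    """Return wikitext with {{template}} prepended, unless it is already there."""
    tag = "{{" + template + "}}"
    if tag in wikitext:
        # Already tagged: return the text unchanged so repeated runs are harmless.
        return wikitext
    return tag + "\n" + wikitext


if __name__ == "__main__":
    page_text = "Some article text."
    print(add_template(page_text, "Stub"))
```

Keeping the text transformation separate from page loading and saving makes it easy to check on sample text before the bot touches any real page.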
APIs for bots
To make changes to Anarchopedia's pages, a bot has to retrieve pages from Anarchopedia and send edits back. Several interfaces are available for that purpose.
- MediaWiki API (api.php). This interface was written specifically to let automated processes such as bots make queries and post changes. Data is available in several machine-readable formats (JSON, XML, YAML, ...). Features have been fully ported from the older Query API interface.
- Status: Available on all Anarchopedia projects, with a very complete set of queries. Pages can also be edited through api.php, so bots can operate entirely without screen scraping.
- Screen scraping (index.php). Screen scraping involves requesting an Anarchopedia page, looking at the raw HTML code (what you would see if you clicked View->Source in most browsers), and then analyzing the HTML for patterns. There are very few reasons to use this technique anymore; it is mainly used by older bot frameworks written before the API had as many features.
- Status: Deprecated.
- Special:Export can be used to obtain a bulk export of page content in XML form. See Manual:Parameters to Special:Export for the available arguments.
- Status: Built-in feature of MediaWiki, available on all Anarchopedia servers.
- Raw (wikitext) page processing: sending an action=raw or an action=raw&templates=expand GET request to index.php returns the unprocessed wikitext source of a page.
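As a rough sketch of the two request styles listed above, the following builds an api.php query URL and an index.php raw-wikitext URL using only the Python standard library. The base URL `http://example.org/w` and the helper names are assumptions for illustration; substitute the script path of the wiki you actually work on.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def api_query_url(api_url, title):
    """Build an api.php URL asking for the latest revision text of one page as JSON."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def raw_url(index_url, title, expand_templates=False):
    """Build an index.php URL that returns the wikitext of a page (action=raw)."""
    params = {"title": title, "action": "raw"}
    if expand_templates:
        params["templates"] = "expand"
    return index_url + "?" + urlencode(params)


def fetch(url):
    """Download a URL and return its body as text (performs a network request)."""
    # The User-Agent string is a placeholder; identify your own bot here.
    request = Request(url, headers={"User-Agent": "ExampleBot/0.1"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    base = "http://example.org/w"  # hypothetical wiki script path
    print(api_query_url(base + "/api.php", "Main Page"))
    print(raw_url(base + "/index.php", "Main Page", expand_templates=True))
```

Only the URL builders run without network access; `fetch` is included to show where the actual GET request would happen.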
Installing Pywikipediabot
In short, create a subdirectory in your working directory and change into it:
$ mkdir mybot
$ cd mybot
Invoke Subversion checkout to download pywikipediabot:
$ svn checkout http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/ pywikipedia
Pywikipediabot is a very active project, so you should update your copy periodically (daily or weekly, for example):
$ cd pywikipedia/
$ svn update
Logging in
Your bot must be logged in to each project it works on. Go into the directory "pywikipedia":
$ cd pywikipedia/
Create the file "user-config.py" with your favorite editor. It should contain something like:
mylang = 'en'
family = 'anarchopedia'
usernames['anarchopedia']['en'] = 'My Bot Name'
usernames['anarchopedia']['pl'] = 'My Bot Name'
usernames['anarchopedia']['sr'] = 'My Bot Name II' # you may use different names
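If you run bots on many language editions, the file can also be generated instead of typed by hand. The helper below is a hypothetical sketch, not part of Pywikipediabot; it only produces text in the same shape as the example above.

```python
def render_user_config(family, mylang, names):
    """Return the text of a user-config.py for one family.

    names maps a language code to the bot account name used on that project.
    """
    lines = [
        "mylang = {!r}".format(mylang),
        "family = {!r}".format(family),
    ]
    for lang, name in names.items():
        # One usernames[...] line per project the bot should log in to.
        lines.append("usernames[{!r}][{!r}] = {!r}".format(family, lang, name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    accounts = {"en": "My Bot Name", "pl": "My Bot Name", "sr": "My Bot Name II"}
    print(render_user_config("anarchopedia", "en", accounts), end="")
```

The result can be written to "user-config.py" in the "pywikipedia" directory; review it before use, since it controls which accounts the bot logs in with.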
By invoking the "login.py" program...
$ python login.py
... you will get a prompt for inserting your password...
Checked for running processes. 1 processes currently running, including the current process.
Password for user My Bot Name on anarchopedia:en:
... and, if everything goes well, the program will print a message like:
Logging in to anarchopedia:en as My Bot Name
Should be logged in now
However, if you have more than one account (as in the configuration above) and use one password for all bots (which is a little less safe, but much more practical), you should use the options "-all" and "-pass":
$ python login.py -all -pass
Checked for running processes. 1 processes currently running, including the current process.
Password for all accounts:
Logging in to anarchopedia:sr as My Bot Name II
Should be logged in now
Logging in to anarchopedia:en as My Bot Name
Should be logged in now
Logging in to anarchopedia:pl as My Bot Name
Should be logged in now
Invoking Python
You may invoke scripts by typing "python" before them:
$ python mybot.py
Scripts
Here is a list of the existing bots with links to their descriptions:
- Main bot scripts
- Other bot scripts
- Auxiliary programs