How to interface with a website via Python?

Status
Not open for further replies.
Level 15
Joined
Aug 7, 2013
Messages
1,338
Hi,

How can I use Python to fill in website forms / click buttons?

I know how to open a URL and manipulate it to fetch different pages, but some websites don't put any such information in the URL.
 
Level 29
Joined
Jul 29, 2007
Messages
5,174
Depending on how the page was made, the form is submitted either as a GET request (parameters in the URL) or as a POST request (usually URL-encoded data in the HTTP body).

As a general guideline, forms tend to be POST and everything else tends to be GET, but you can check the page source to be sure: the form's method attribute says which one it uses.

In addition, you could use an HTTP packet sniffer add-on in your browser, e.g. Live HTTP Headers. With it you can know exactly what you are sending to the server and what the server returns.

For example:
Code:
GET /path/to/page?param1=value1&param2=value2 HTTP/1.1
...
Code:
POST /path/to/page HTTP/1.1
...
Content-Type: application/x-www-form-urlencoded
Content-Length: something

param1=value1&param2=value2

As to Python, a simple "python http" Google search should suffice.
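
For a rough idea of what that looks like in Python, here is a minimal sketch using only the standard library (Python 3). The host example.com and the names param1/param2 are placeholders carried over from the examples above, not a real endpoint. It only builds the two requests so you can compare them; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field names, for illustration only.
BASE_URL = "http://example.com/path/to/page"
params = {"param1": "value1", "param2": "value2"}

# GET: the encoded parameters go into the URL's query string.
get_request = Request(BASE_URL + "?" + urlencode(params))

# POST: the same encoded parameters go into the request body as bytes.
post_request = Request(
    BASE_URL,
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

To actually send either one, pass it to urllib.request.urlopen() and read the response.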
 
Level 15
Joined
Aug 7, 2013
Messages
1,338
Ok so I wrote a script that can almost submit forms, but it's not working. Here's the script (I use mechanize btw).

This is the page I am starting at: http://www.wordreference.com/

When I read the response of submitting the form, I should get the contents of this page: http://www.wordreference.com/es/translation.asp?tranword=garden

However, I don't get that at all. Why is my code not working? I filled in both controls (the word to translate and the dictionary) and then submitted the form.

Edit: Ok, here's the problem. For some reason, after I submit the form I actually end up at yahoo.com (url = 'https://www.yahoo.com/'). How is that possible?

Code:
import mechanize

br = mechanize.Browser()
# ignore robots.txt
br.set_handle_robots(False)

# open the page
response = br.open("http://www.wordreference.com/")

# this is the only form available
# "sbox" = "search box?"
br.select_form(name="sbox")

# choose the dictionary
# in this case, we want the English-Spanish dictionary
controlDict = br.form.find_control("dict")
# set the value to "enes" (English-Spanish)
controlDict.value = ["enes"]

# choose the word we are searching for
# in this case, we want to find how to say "garden" in Spanish
controlWord = br.form.find_control("w")
# set the value to "garden" (the word we want to translate)
controlWord.value = "garden"

print(br.form)

# now submit the form and read the response body
response = br.submit()
s = response.read()

# save the response body to a file (binary mode, since read() returns raw bytes)
with open("test.txt", "wb") as w:
    w.write(s)
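
Since the expected result page quoted above carries the word in its URL, one possible workaround is to skip the form and build that URL directly. This is only a sketch: translation_url is a made-up helper name, and it assumes the tranword pattern seen for "garden" holds for other words too, which this thread does not confirm. It does not explain the yahoo.com redirect.

```python
from urllib.parse import urlencode

# URL pattern taken from the result page quoted in the post; assumed, not
# confirmed, to work for words other than "garden".
RESULT_URL = "http://www.wordreference.com/es/translation.asp"


def translation_url(word):
    """Build the English-Spanish result URL for a word (hypothetical helper)."""
    return RESULT_URL + "?" + urlencode({"tranword": word})


print(translation_url("garden"))
```

The resulting string can then be passed to br.open() in place of the select_form/submit steps.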
 