Executing Javascript from Python
79 votes
I have HTML webpages that I am crawling using xpath. The etree.tostring of a certain node gives me this string:
<script>
<!--
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
//-->
</script>
I just need the output of escramble_758(). I can write a regex to figure out the whole thing, but I want my code to remain tidy. What is the best alternative?
I am zipping through the following libraries, but I didn't see an exact solution. Most of them try to emulate a browser, which makes things snail slow.
- http://code.google.com/p/python-spidermonkey/ (clearly says it's not yet possible to call a function defined in Javascript)
- http://code.google.com/p/webscraping/ (don't see anything for Javascript, I may be wrong)
- http://pypi.python.org/pypi/selenium (Emulating browser)
Edit: An example would be great (barebones will do).
edited Jun 2, 2012 at 12:40 by dda
asked Apr 13, 2012 at 6:39 by jerrymouse
9 answers
75 votes
You can also use Js2Py, which is written in pure Python and can both execute JavaScript and translate it to Python. It supports virtually all of JavaScript, even labels, getters, setters and other rarely used features.
import js2py
js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
""".replace("document.write", "return ")
result = js2py.eval_js(js) # executing JavaScript and converting the result to python string
Advantages of Js2Py include portability and extremely easy integration with Python (since the JavaScript is basically translated to Python).
To install:
pip install js2py
edited Aug 1, 2015 at 10:01
answered May 29, 2015 at 19:08 by Piotr Dabkowski
57 votes
Using PyV8, I can do this. However, I have to replace document.write with return because there's no DOM and therefore no document.
import PyV8
ctx = PyV8.JSContext()
ctx.enter()
js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
"""
print ctx.eval(js.replace("document.write", "return "))
Or you could create a mock document object:
class MockDocument(object):
    def __init__(self):
        self.value = ''
    def write(self, *args):
        self.value += ''.join(str(i) for i in args)
class Global(PyV8.JSClass):
    def __init__(self):
        self.document = MockDocument()
scope = Global()
ctx = PyV8.JSContext(scope)
ctx.enter()
ctx.eval(js)
print scope.document.value
edited Sep 8, 2015 at 20:21 by Daniel F
answered Apr 13, 2012 at 7:07 by Kien Truong
33 votes
One more solution, as PyV8 seems to be unmaintained and dependent on an old version of libv8.
PyMiniRacer is a wrapper around the V8 engine; it works with the new version and is actively maintained.
pip install py-mini-racer
from py_mini_racer import py_mini_racer
ctx = py_mini_racer.MiniRacer()
ctx.eval("""
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
return a+c+b;
}
""")
ctx.call("escramble_758")
And yes, you have to replace document.write with return, as others suggested.
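The replacement step that several answers rely on can be sketched with the standard library alone. The helper name `script_to_returning_js` is hypothetical, and the sketch assumes the snippet has the same shape as the one in the question (a single `document.write` call inside the function). It only prepares the source text; running it is left to whichever engine you pick.

```python
import re


def script_to_returning_js(html_snippet):
    """Hypothetical helper: turn a <script> block into plain JS that
    returns the value it would otherwise have written to the page."""
    # Drop the <script> tags themselves.
    js = re.sub(r"</?script[^>]*>", "", html_snippet)
    # Drop the legacy HTML-comment guards around the code.
    js = js.replace("<!--", "").replace("//-->", "")
    # There is no DOM outside a browser, so return the value instead.
    js = re.sub(r"\bdocument\.write\s*\(", "return (", js)
    return js.strip()
```

The returned string can then be handed to the engine's eval call. If a script calls document.write more than once, `return` would stop at the first call, so a mock document object is probably the safer route in that case.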
answered Mar 21, 2018 at 9:16 by Dienow
10 votes
You can use js2py context to execute your js code and get output from document.write with mock document object:
import js2py
js = """
var output;
document = {
write: function(value){
output = value;
}
}
""" + your_script
context = js2py.EvalJs()
context.execute(js)
print(context.output)
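Note that the mock above keeps only the last value passed to document.write. If a script writes several times, a prelude that appends may be more useful. This is a minimal sketch: `MOCK_DOCUMENT_JS` and `with_mock_document` are hypothetical names, not part of js2py, and the code only builds the JavaScript source string.

```python
# Hypothetical prelude (not part of js2py): defines a stand-in `document`
# whose write() appends every argument to a global `output` string.
MOCK_DOCUMENT_JS = """
var output = '';
var document = {
    write: function () {
        for (var i = 0; i < arguments.length; i++) {
            output += String(arguments[i]);
        }
    }
};
"""


def with_mock_document(script):
    """Return `script` prefixed with the mock-document prelude."""
    return MOCK_DOCUMENT_JS + "\n" + script
```

The result can be passed to context.execute in place of `js`, after which context.output should hold the concatenated text.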
edited Oct 15, 2018 at 20:12
answered Sep 16, 2018 at 23:46 by Mirko
9 votes
You can use requests-html, which will download and use Chromium underneath.
from requests_html import HTML
html = HTML(html="<a href='http://www.example.com/'>")
script = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
return a+c+b;
}
"""
val = html.render(script=script, reload=False)
print(val)
# +1 425-984-7450
You can read more on this in the requests-html documentation.
answered Apr 8, 2020 at 15:19 by Levon
6 votes
quickjs should be the best option now that it is available. Just pip install quickjs and you are ready to go.
The code below is modified from the example in the README.
from quickjs import Function
js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
"""
escramble_758 = Function('escramble_758', js.replace("document.write", "return "))
print(escramble_758())
https://github.com/PetterS/quickjs
answered Oct 31, 2019 at 6:13 by echo
5 votes
Really late to the party, but you can use STPyV8, a successor of PyV8 that is regularly maintained by Cloudflare, a reputable organization (subjective). Here is the repository URL:
https://github.com/cloudflare/stpyv8
answered Jul 13, 2022 at 16:01 by cstayyab
5 votes
PythonMonkey is a new alternative that uses Firefox's JS engine.
Just pip install pythonmonkey to get started.
import pythonmonkey as pm
some_js_code = """
function escramble_758() {
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
return a+c+b;
}
escramble_758()
"""
res = pm.eval(some_js_code)
print(res) # +1 425-984-7450
You can also require JavaScript files directly from Python using pythonmonkey.require.
answered Jul 21, 2023 at 14:29 by Will Pringle
0 votes
In 2024, PyMiniRacer is unmaintained. The new state of the art is its fork: https://bpcreech.com/PyMiniRacer/
pip install mini-racer
from py_mini_racer import MiniRacer
ctx = MiniRacer()
ctx.eval("""
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
return a+c+b;
}
""")
ctx.call("escramble_758")