How to get full text using Tweepy - python-3.x

I am new to Python and I'm using this script to get tweets, but the problem is that it does not give the full text; instead the text is truncated and ends with a shortened URL.
Output:
"text": "#Damien85901071 #Loic_23 #EdwinZeTwiter #Christo33332 #lequipedusoir #Cristiano #RealMadrid_FR #realfrance_fr\u2026 ' ShortenURL",
What changes do I need to make in this script to get the full text?

Look into Twitter's tweet_mode=extended option and the places in the Python code where you might need to add that to the script.
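For example, with Tweepy's standard API client, passing tweet_mode='extended' and reading full_text returns the untruncated text. This is a minimal sketch assuming Tweepy 3.x; the credentials and screen name are placeholders, and the exact calls in your script may differ:
import tweepy

# Placeholder credentials - replace with your own keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

# tweet_mode='extended' asks the API not to truncate the tweet text.
for status in api.user_timeline(screen_name="some_account", tweet_mode="extended", count=5):
    # In extended mode the text lives in full_text instead of text.
    print(status.full_text)
For retweets, the untruncated original text is usually under status.retweeted_status.full_text.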

Related

How can I use like_by_feed of the InstaPy bot?

Hello, when I use like_by_feed it doesn't work and just returns:
---->> Total of links feched for analysis: 0
---->> Total of links feched for analysis: 0
I think the problem is with the xpath //article/div[2]/div[2]/a, which the bot uses when it wants to get the link of a post.
The xpath seems to be broken. You can use the following to fix the Like By Feed feature.
Find the file "xpath_compile.py" (I'm on Windows : \Python\Lib\site-packages\instapy\xpath_compile.py)
Find and replace the text
xpath["get_links_from_feed"] = {"get_links":
"//article/div[3]/div[2]/a"}
with
xpath["get_links_from_feed"] = {"get_links":
"//article/div/div[3]/div[2]/a"}
Save the file and run your Bot again.
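If you're not sure where InstaPy is installed, you can locate the directory that contains xpath_compile.py from Python itself (a small sketch; the printed path will differ per machine):
import os
import instapy

# Prints the package directory that holds xpath_compile.py.
print(os.path.dirname(instapy.__file__))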

How to convert a JSON file into PO?

Hello friends and colleagues,
A quick question: how do I convert a JSON file into PO?
I had a PO file with the relevant translations, converted it to JSON on some website, then wrote a little script in NodeJS to translate the keys via the Google Translate API, and now I just want to convert this translated JSON back to PO.
Is there any easy way? I can't seem to find any working npm packages or anything else.
Please help,
Thanks
Online tools like https://localise.biz/free/converter/po-to-json can help.
To convert from the command line there's an open-source repo on GitHub: https://github.com/2gis/i18n-json-po
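If the JSON is just a flat mapping from source strings to translated strings (an assumption about your file's shape), you can also write the PO entries yourself with a few lines of Python instead of hunting for an npm package. A minimal sketch, with hypothetical file names:
import json

# Assumption: translations.json maps each source string to its translation.
with open("translations.json", encoding="utf-8") as f:
    translations = json.load(f)

with open("translations.po", "w", encoding="utf-8") as out:
    # Minimal PO header.
    out.write('msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n\n')
    for source, translated in translations.items():
        # Escape embedded quotes so the PO entries stay valid.
        source = source.replace('"', '\\"')
        translated = translated.replace('"', '\\"')
        out.write(f'msgid "{source}"\nmsgstr "{translated}"\n\n')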

Why isn't pandas profiling showing any output in IPython?

I have a quick question about "pandas_profiling". Basically, I'm trying to use pandas profiling, but instead of showing the report it prints something like this:
<pandas_profiling.ProfileReport at 0x23c02ed77b8>
Where am I making the mistake? Does it have anything to do with IPython? Because I'm using IPython in Anaconda.
Try this:
pfr = pandas_profiling.ProfileReport(df)
pfr.to_notebook_iframe()
pandas_profiling creates an object that then needs to be displayed or output. One standard way of doing so is to save it as an HTML file:
profile.to_file(outputfile="sample_file_name.html")
("profile" being the variable you used to save the profile itself)
It doesn't have to do with IPython specifically. The difference is that because you're going line by line (instead of running a full block of code, including the reporting step), it's showing you the object itself. The code above should allow you to see the report once you open it up.
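Putting it together, a minimal sketch (the DataFrame and file name are placeholders, and depending on your pandas_profiling version the keyword may be output_file rather than outputfile):
import pandas as pd
import pandas_profiling

# Placeholder data - use your own DataFrame here.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

profile = pandas_profiling.ProfileReport(df)
profile.to_file(outputfile="report.html")  # then open report.html in a browser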

How to extract Python output out of the cmd?

I am using cmd in Windows 7 and I have encountered the following problem:
I type the command python in cmd to enter the Python interpreter, then run the following:
import requests
r=requests.get("https://nameofthepege.com")
r.text
After that the whole console gets full of HTML code. The last 200 to 300 lines of the output are visible but the rest are not. How can I see more lines?
Moreover, is there any way to extract the HTML code produced by the r.text command into a new file from within the Python environment or the cmd?
Regarding your first question.
After that the whole console gets full of html code. The last 200 to
300 lines of the output are visible but the rest are not. How can I
see more lines?
Response: The CMD default buffer is limited to 300 lines. You should increase the CMD prompt buffer size.
The below tutorial explains how to do that:
https://www.tenforums.com/tutorials/94089-change-command-prompt-screen-buffer-size-windows.html
Regarding your second question.
Moreover, is there any way to extract the html code produced by the
r.text command in a new file from within the python environment or the
cmd?
Response: You can write the content from r.text into a file by creating a file with Python's open() function. More information about reading and writing files is in the link below:
https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
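For instance, a minimal sketch (the URL and file name are placeholders):
import requests

r = requests.get("https://nameofthepege.com")

# Write the response body to a file instead of printing it to the console.
with open("page.html", "w", encoding="utf-8") as f:
    f.write(r.text)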

How to verify links in a PDF file

I have a PDF file and I want to verify whether the links in it are proper. Proper in the sense that all URLs specified are linked to web pages and nothing is broken. I am looking for a simple utility or a script which can do this easily.
Example:
$ testlinks my.pdf
There are 2348 links in this pdf.
2322 links are proper.
Remaining broken links, and the page numbers on which they appear, are logged in brokenlinks.txt
I have no idea whether something like that exists, so I googled and searched Stack Overflow as well, but did not find anything useful yet. I would like to know if anyone has any idea about it.
Updated: to make the question clear.
You can use pdf-link-checker
pdf-link-checker is a simple tool that parses a PDF document and checks for broken hyperlinks. It does this by sending simple HTTP requests to each link found in a given document.
To install it with pip:
pip install pdf-link-checker
Unfortunately, one dependency (pdfminer) is broken. To fix it:
pip uninstall pdfminer
pip install pdfminer==20110515
I suggest first using the Linux command-line utility 'pdftotext' - you can find the man page:
pdftotext man page
The utility is part of the Xpdf collection of PDF processing tools, available on most Linux distributions. See http://foolabs.com/xpdf/download.html.
Once installed, you could process the PDF file through pdftotext:
pdftotext file.pdf file.txt
Once processed, a simple Perl script can search the resulting text file for http URLs and retrieve them using LWP::Simple. LWP::Simple->get('http://...') will allow you to validate the URLs with a code snippet such as:
use LWP::Simple;
$content = get("http://www.sn.no/");
die "Couldn't get it!" unless defined $content;
That would accomplish what you want to do, I think. There are plenty of resources on how to write regular expressions to match http URLs, but a very simple one would look like this:
m/http[^\s]+/i
"http followed by one or more not-space characters" - assuming the URLs are property URL encoded.
There are two lines of enquiry with your question.
Are you looking for regex verification that the link contains key information such as http:// and valid TLD codes? If so I'm sure a regex expert will drop by, or have a look at regexlib.com which contains lots of existing regex for dealing with URLs.
Or, if you want to verify that the websites exist, then I would recommend Python + Requests, as you could script checks to see whether the websites exist and don't return error codes.
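For example, after dumping the PDF to text with pdftotext, something along these lines would flag URLs that fail to respond with a success code. This is a minimal sketch; the regex, file name, and timeout are assumptions:
import re
import requests

# Text previously dumped from the PDF, e.g. via: pdftotext file.pdf file.txt
with open("file.txt", encoding="utf-8") as f:
    text = f.read()

# Very simple pattern: "http(s) followed by one or more non-space characters".
urls = set(re.findall(r"https?://\S+", text))

for url in urls:
    try:
        status = requests.head(url, allow_redirects=True, timeout=10).status_code
    except requests.RequestException:
        status = None
    if status is None or status >= 400:
        print("BROKEN:", url)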
It's a task which I'm currently undertaking for pretty much the same purpose at work. We have about 54k links to be processed automatically.
Collect links by:
enumerating links using an API, dumping the PDF as text and linkifying the result, or saving it as HTML with PDFMiner.
Make requests to check them:
There is a plethora of options depending on your needs.
The advice in https://stackoverflow.com/a/42178474/1587329 was the inspiration to write this simple tool (see gist):
'''loads pdf file in sys.argv[1], extracts URLs, tries to load each URL'''
import urllib
import sys
import PyPDF2

# credits to stackoverflow.com/questions/27744210
def extract_urls(filename):
    '''extracts all urls from filename'''
    PDFFile = open(filename, 'rb')
    PDF = PyPDF2.PdfFileReader(PDFFile)
    pages = PDF.getNumPages()
    key = '/Annots'
    uri = '/URI'
    ank = '/A'
    for page in range(pages):
        pageSliced = PDF.getPage(page)
        pageObject = pageSliced.getObject()
        if pageObject.has_key(key):
            ann = pageObject[key]
            for a in ann:
                u = a.getObject()
                if u[ank].has_key(uri):
                    yield u[ank][uri]

def check_http_url(url):
    urllib.urlopen(url)

if __name__ == "__main__":
    for url in extract_urls(sys.argv[1]):
        check_http_url(url)
Save it to filename.py and run it as python filename.py pdfname.pdf. Note that the script uses Python 2 APIs (urllib.urlopen and dict.has_key), so run it under Python 2 or adapt those calls for Python 3.
