Accentuation in PHP, showing Latin symbols - string

I'm writing a program to parse HTML data from a Portuguese website. The problem is that when I echo the data I read, I get these weird symbols:
Meu PC estragou e tenho um netbook que u
so para assuntos acadΩmicos. Que raiva, nπo roda nem CS aqui. Aff que raiva! Com
o pode? Jß coloquei todas as mais baixar configuraτ⌡es e nao roda!
The original text is:
Meu PC estragou e tenho um netbook que u
so para assuntos acadêmicos. Que raiva, não roda nem CS aqui. Aff que raiva! Com
o pode? Já coloquei todas as mais baixar configurações e nao roda!
Notice the accentuation:
acadêmicos -> acadΩmicos
Já -> Jß
How do I fix this? I already tried:
echo utf8_decode($assunto);
but it didn't work. Help!

Set the encoding in the header.
header( 'Content-Type: text/html; charset=UTF-8' );
http://php.net/manual/en/function.header.php
Then, to make sure the browser uses UTF-8, add a meta tag in the HTML head:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

Related

Request with Python [closed]

Hi, I'm new here and still don't understand some of the rules, but I have a question about the "requests" library in Python - not only in Python, but about requests in general. My question is this: I'm accessing an API (GitHub's GraphQL) and it limits me to viewing only 100 commits. How can I get the remaining commits, given that every time I make a request sequentially it returns the same 100 elements?
What I tried was simply running two requests in a row; I expected to receive 200 commits, but I got the same 100.
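The GitHub GraphQL API pages results with cursors, so a second identical request returns the same first 100 commits; each follow-up request has to pass the endCursor from the previous page. Below is a minimal sketch of that loop, assuming a personal access token in TOKEN; the query shape and names (all_commits, owner, name) are illustrative, not taken from the original post:
import requests

URL = "https://api.github.com/graphql"
HEADERS = {"Authorization": "bearer TOKEN"}  # TOKEN is a placeholder for a personal access token

QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 100, after: $cursor) {
            pageInfo { hasNextPage endCursor }
            nodes { oid messageHeadline }
          }
        }
      }
    }
  }
}
"""

def all_commits(owner, name):
    cursor = None
    while True:
        variables = {"owner": owner, "name": name, "cursor": cursor}
        resp = requests.post(URL, json={"query": QUERY, "variables": variables}, headers=HEADERS)
        resp.raise_for_status()
        history = resp.json()["data"]["repository"]["defaultBranchRef"]["target"]["history"]
        yield from history["nodes"]
        if not history["pageInfo"]["hasNextPage"]:
            break
        # continue after the last commit of this page instead of starting over
        cursor = history["pageInfo"]["endCursor"]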

Cannot import SongLyrics from lyrics_extractor

I wrote this code:
from lyrics_extractor import SongLyrics
apiKey = 'AIzaSyBDPVEi1OtzB3Nm6i9fd8HTkMCjsselIpM'
engineID = '35df92fbe0cad839c'
extract_lyrics = SongLyrics(apikey, engineID)
lyrics = extract_lyrics.get_lyrics("Reyes de la noche")
print(lyrics)
It is not working; I got this import error:
ImportError: cannot import name 'Songlyrics' from 'lyrics_extractor' (C:\Users\mica\AppData\Local\Programs\Python\Python39\lib\site-packages\lyrics_extractor\__init__.py)
What could be wrong?
I found two problems:
You need to install the library with pip:
pip install lyrics-extractor
apiKey has the wrong capitalization on line 4: the code passes apikey, which is not defined.
A further suggestion: the returned value is a dict, which looks nicer when printed like this:
from lyrics_extractor import SongLyrics
apiKey = 'AIzaSyBDPVEi1OtzB3Nm6i9fd8HTkMCjsselIpM'
engineID = '35df92fbe0cad839c'
extract_lyrics = SongLyrics(apiKey, engineID)
lyrics = extract_lyrics.get_lyrics("Reyes de la noche")
print(lyrics['title'])
print(lyrics['lyrics'])
Output:
Guasones – Reyes De La Noche Lyrics
Fuimos mucho mas que nada
Fuimos la mentira
Fuimos lo peor
Fuimos los soldados a la madrugada
Con esta ambición
Y ahora estoy en libertad
Y ahora que puedo pensar
En no volver hacer ese
El mismo de antes
Y que tristeza hay en la ciudad, amor
Sábado soleado
Y en el centro de la estatua del dolor
Me sentí parado
Fuimos muchos más que todos
Reyes de la noche
De esta tempestad
Si te vendí, si te robe, te traicione
Fui por uno más
Fuimos perros de la noche
Oxidados en tristeza
Y querer lo que querer
Sin tener que lastimar
Recordando que tu amor
Se robo la dignidad
Ahora olvidemos los dos
No volvamos a empezar
¿Para que?...
PS: I was manually writing out lyrics this morning for a song called "Brothers and Sisters" by Les Nubians. When I tried to use this library to get the lyrics for that song, like this:
extract_lyrics.get_lyrics("Brothers and Sisters")
... I got lyrics for a more popular song of the same title by another musical group. However, after adding the name of the group I wanted, I got the correct song's lyrics:
extract_lyrics.get_lyrics("Brothers and Sisters Nubians")
For more details, see the library's documentation.

ERROR when Parsing JSON Array as a JSON Object

I'm trying to convert news into JSON in EJS, but I'm getting this error:
JSON.parse: bad control character in string literal at line 1 column 491 of the JSON data
My code in EJS:
var news = JSON.parse('<%- JSON.stringify(news) %>');
My JSON when I just use JSON.stringify(news):
[{"link":"https://www.google.com/appserve/mkt/p/AL8lKjMTbG3Cwo5S_mSyJIgnnTSnG5CShXJluGQmCXZQem2HZJYO3Xni6oi7YkO1bf0AiPEzctP9t7p4iXohF_oSmh4lekT6N4_Xi_3FsOwFL7v0aGYESdS7Hwy0ljNy94xgSIa-p44jadT8K72ncdZLathEWT9VLdgFzKPyIbWNdfzXjna8ZZ-l0MnFGvP7ctxpUmlnS0bdioEzr6Vea3RXuJ9ql7G66mK3Mmk8b88uTsEU0DgQibiPnhj57-Xddg","title":"PT anuncia candidatura de Fernando Haddad à Presidência no ...","snippet":"11 set. 2018 ... O prazo dado pelo TSE para o partido apresentar à Justiça Eleitoral o substituto
de Lula terminava às 19h desta terça. Na chapa original ..."},{"link":"https://www.google.com/appserve/mkt/p/AL8lKjOk4AvSk26yeM_CLTwL-pbhemUMarCpKtz2cjmDF5u-c1MKiPSNST_LKzGUhr5gIxpym10fbAyKebM_k5ztcDT4ccJLm7CiDn51AF8esXzqtzkfStIpvDMLDSzBEiJsY_vZ-n16Bq1rVLol6NPlA84ZHjMcdBFsRqyvAn1uWquy2kWhGINO24sm7grPZssXqNuqn3kE2VQWxUZrqU76ZfKon7_2sP_JnVoX616a7NqZ87w","title":"Pesquisa Ibope: Lula, 37%; Bolsonaro, 18%; Marina, 6%; Ciro, 5%","snippet":"20 ago. 2018 ... Em razão desse quadro jurídico, o Ibope pesquisou outro cenário, com o atual
candidato a vice na chapa de Lula, Fernando Haddad."},{"link":"https://www.google.com/appserve/mkt/p/APDk4sNLn4v5wG7jZc1IMxmNqeoSmmu3CoLrgwpbMBTtGaGIPP6qNZCYoZJE-h7JvLCpES_LHdTgnGxzrSzi7qXlTfpcw_NWkH9lhvjaWJZmZCbHZM5jPivEEvXVa-370EEkKothZUYVdHjcJ_7Gd0FtSViBkVPdM1eplOHC2c-cxE5l-6iUJv-eIvsaBoTNH8m-N7KeL55C2RHIhWox4t_O_J6TePdYShasXr2p9CljniJyBnJg2gJqpDnpEg","title":"PT registra candidatura de Lula a presidente com ato em frente ao ...","snippet":"15 ago. 2018 ... Presidente do PT, Gleisi Hoffmann, entrega registro da candidatura de Lula a
servidor do TSE — Foto: Nelson Jr./ASCOM/TSE ..."},{"link":"https://www.google.com/appserve/mkt/p/AL8lKjPgDq40q5PA8lJ4zEjCmTIcjEId9kKJi_u79_HEROFIYaeDAza3mH3HEG5yW1W_0o0zUQTvFVUd9gRF64I5kZ9WUqwlrXBv3N0-ficFsPjhfdxo4H3CqkPvbjICafkJlZM2LnCTC7sKH_Wu7emMCZEki0XlYNuHyOOp114XJK5GbvR33QyBK3e9sVl9hoOpXsP44bNFtPy2ntiTo5ew58YJgSaB59sg1xrtW7PdaGKbcHDsg6bRyw","title":"Fachin nega pedido da defesa de Lula para suspender inelegibilidade","snippet":"6 set. 2018 ... Fachin nega pedido para suspender condenação de Lula no caso do ... o pedido
da defesa do ex-presidente Luiz Inácio Lula da Silva para ..."},{"link":"https://www.google.com/appserve/mkt/p/APDk4sNLSBs3Q0r_tjyeUzsiMNvPILeN0DuMwwFLp7LWs11DJltsjOoj6lALGoBQ_HdCP5Pxa8rTJIe6SmXbEyi5iZ6CFNdrHxF0dwF2pf3J2UX0uIREc_MoNTGwEj53rHgTA21KV0gXTwNC4U-SPsRqFhI322S4AjnWeRr1h1ThAl1gJEEDEMSWm5-JEWPkSH4zrzjmbltEp_mKtMqG7PTLQdat0d_wIowh1wGQ3oMDrSM95XpVBOzvm2yg_t4wd5lh","title":"Entenda o que significa a impugnação da candidatura de Lula feita ...","snippet":"16 ago. 2018 ... As ações de impugnação ainda não impedem Lula de concorrer. O ministro Luís
Roberto Barroso foi sorteado como relator do registro de ..."},{"link":"https://www.google.com/appserve/mkt/p/AL8lKjMNo82Rv1wcvOpdbrQX7C7zD2sW7lFk0EBtJZZNs8rx0-tq2bYfyB_BsCGjt9fNa82lkSBH8Y4P_Of_uj9ryfcFxjIt_pbf49sLtNeAR529DWuje9j8Ps8-Z3qH2Wrh6N_Sb8P9eS88o26QKNLi94PDrPvFg7Q5Hv_i02L95dF_ZOV_avLu9sE0zzaX2l6hCQ1t-lXxegaf","title":"Comitê da ONU reafirma direito de Lula ser candidato, diz defesa ...","snippet":"10 set. 2018 ... Zanin afirmou que a defesa de Lula informará ao STF a nova decisão. Uma nova
decisão do Comitê de Direitos Humanos da ONU reafirmou ..."},{"link":"https://www.google.com/appserve/mkt/p/AFOm0uEggpSZolqYBXndC11CTFh4Bx4ojWZe-LQ4LpX3DXG401x0u8rk7AS41mhCRQqX6Ze7qXrfe0blgoqgeKkBvl2Roik39be5d8XX2HlZpL4bYGlC18o9U650kXjAkUn_vWc6YJRH5bUf1Hw15Zz4IlOE5idUbGmd0rnPl43DKMDReRgIHGc72dtcdXrz-ihJualpVoNpmfdaEQ","title":"Datafolha: sem Lula, Bolsonaro lidera e disputa fica acirrada ...","snippet":"31 jan. 2018 ... RIO — Pesquisa Datafolha divulgada nesta quarta-feira mostra que o ex-
presidente Luiz Inácio Lula da Silva , mesmo condenado pelo ..."},{"link":"https://www.google.com/appserve/mkt/p/AFIPhzVJ3RSCVztRel4V7tABx7g_RC0G4uE43Wb1CdAxSf810y0LN7XN0iU1Ub5cj7-anDSrgzKJX9k1sx0tGFtzZhnTSyvCbAjhgqEHA-rrGWKMVAaHUXewZYj5IBY","title":"Após tiros e sob tensão, Lula encerra caravana em Curitiba horas ...","snippet":"28 mar. 2018 ... Poucas horas depois de a caravana do ex-presidente Luiz Inácio Lula da Silva
chegar a Curitiba, ainda sob abalo pelo ataque que atingiu ..."},{"link":"https://www.google.com/appserve/mkt/p/AJ-PF7wwGAYu07EXXfFG9u-ipk-azIf0t1m96KDvmxh5uuiAjuDg9QFlysO2nHCs9x0IRYCtpd_CphjEsDCZGMtxpcpzgeQdP-rJumaQVCIx6a2YCXteM6qFB6cerc5JSvbVfp5qTEwyDuB_R8gR9ZOK44cZxY2VIjPIAS4qsX-Wb2y0nSVXTtuM4BOZdPthrzxE8fXHoRc_zmYheufyS3FGXg","title":"Presidenciáveis e políticos comentam decisão de desembargador ...","snippet":"8 jul. 2018 ... O STF já havia negado HC para soltar Lula. Ele será solto neste domingo. A
decisão foi monocrática, de um desembargador que foi filiado ao ..."},{"link":"https://www.google.com/appserve/mkt/p/AIQrb_7rn0-8BzJy4lp5vhkK_EOS2rsg2pNbSKHMcSAv7Zxb_T8GoBAYLmUZidgqaFo8Q0sgjTMTLhJe8ouJ0iutSoUi_6XA04aBJe0-d6df3YTmby4CBj7BfXDOYO4mHhO7iFv9i2pOooL4TWu-A5KQ7KqAk-GQP72jmClTQuBmZ5Frl6rlGiaW1A","title":"STJ nega habeas corpus preventivo por unanimidade e decide que ...","snippet":"6 mar. 2018 ... Na mesma decisão, os cinco ministros da Quinta Turma do STJ negaram um
pedido extra da defesa para suspender a inelegibilidade de Lula ..."}]
EDIT: I have found out the error comes from the "snippet" field (its value contains raw line breaks), but I need it.
I think JSON.parse() is going to have a tough time with what you are feeding it, since it just sees a plain string - not evaluated EJS, and not even valid JSON.
Perhaps something more along the lines of:
var json = <your-JSON-here>;
var news = ejs.render('<%- JSON.stringify(news) %>', {news: json});
Then if you want, you can always pull it back in as an object with JSON.parse():
news = JSON.parse(news);

Beautiful Soup: help with defining the search criteria

I am trying to scrape a website with terms, their English translation, and an explanation. With Beautiful Soup and Requests I have managed to get the tr and td entries, but could someone suggest how to refine my request so that I extract only the term and the explanation? I suspect it is impossible to separate the English translation from the rest, though. The 'table' on the actual website is a combination of 19 tables (http://www.nodimarinari.it/Protopage10.html). The following is one entry:
<tr>
<td valign="top" width="124">
<div align="center"><b><font color="#000066">tabella delle maree</font></b></div>
</td>
<td align="left" width="616"><font color="#000066"> (s.f.) (Maree) tide
table Tabella che riporta, per una determinata zona, l'andamento della
marea, cioe' giorno ed ora delle massime/minime e le relative variazioni
dei livelli delle acque. </font></td>
</tr>
<tr>
<td valign="top" width="124">
<div align="center"><font color="#000066"><b></b></font></div>
</td>
<td align="left" width="616"><font color="#000066"></font></td>
</tr>
Ideally I would like to get 'tabella delle maree', 'tide table', 'Tabella che riporta...'.
Here are my results, based on my understanding of your question. Let me know if this is what you're trying to achieve.
Code:
import requests
from bs4 import BeautifulSoup

url = requests.get("http://www.nodimarinari.it/Protopage10.html")
soup = BeautifulSoup(url.text, 'html.parser')
words = soup.find_all("font", color="#000066")
num = 0
term = []
explanation = []
for word in words:
    if len(word.text) > 1:
        if num % 2 == 0:
            term.append(word.text)
        elif num % 2 == 1:
            explanation.append(str(word.text).replace("\n", ""))
        num += 1
for i in range(0, len(term)):
    print(term[i] + ": " + explanation[i] + "\n")
Output:
abbandonare: (v.) (Emergenze) to abandon L'attodi lasciare la imbarcazione pericolante da parte del capitano e dell'equipaggio,dopo aver esaurito tutti i tentativi suggeriti dall'arte nautica.
abbisciare: (v.) (Cavi e nodi) to range, tojag Preparare su uno spazio piano un cavo od una catena, ad ampie spire,in modo che il cavo o la catena possano svolgersi liberamente e scorreresenza impedimenti.
abbittare: (v.) (Cavi e nodi) to bitt Legareun cavo o una catena ad una bitta.
abbonacciare: (v.) (Vento e Mare) to becalm, tofall calm Il calmarsi del vento e del mare
abbordaggio: (s.m.) [A]. (Abbordi) collisionUrto o collisione accidentale tra imbarcazioni. La legislazione marittimaprevede un'ampio corredo di "norme per evitare l'abbordo in mare" checostituiscono la regolamentazione di base della navigazione [B]. a. intenzionaleboarding, running foul Investire intenzionalmente un'altra imbarcazionecon l'obiettivo di danneggiarla o affondarla. Anticamente si chiamavaanche "arrembaggio"
abbordare: (v.) (A bordo) to collide, to board,to run into vedi "abbordaggio".
abbordo: (s.m.) (Abbordi) collision Equivalentemoderno del termine "abbordaggio".
abbozzare: (v.) (Cavi e nodi) to stop Trattenerecon una legatura provvisoria, detta bozza, un cavo od una catena tesa,per evitarne lo scorrimento durante il tempo necessario per legarla definitivamente.
...
...
...
vogare: (v.) (Remi) to row Sinonimodi "remare".
volta: (s.f.) [A]. (Cavi enodi) bitter, wrap Indica comunemente un giro di un cavo o di una catenaattorno ad una bitta o ad un altro attrezzo atto a trattenere la cimastessa. [B]. dar v. (Manovre) to cleat, to belay, to make fast Legareuna cima od una catena attorno ad una bitta o una galloccia, con o senzanodi.
zattera di salvataggio: (s.f.) (A bordo) liferaftVedi "autogonfiabile".
zavorra: (s.f.) (Terminologia)ballast Pesi che vengono disposti a bordo dell'imbarcazione, normalmenteil piu' in basso possibile nella chiglia, per equilibrare la spinta lateraledel vento ed il conseguente movimento di rollio e di sbandata. Normalmentetali pesi sono in piombo e nei velieri moderni e' la lama di deriva stessa,costruita con metalli pesanti, ad essere zavorrata. In alcune particolarivelieri da regata la zavorra e' costituita da acqua di mare che puo' esserepompata in apposite casse (dal lato sopravvento) quando necessario.
zenit: (s.m.) (Geografia) zenitPunto della sfera celeste che si trova esattamente sulla verticale aldi sopra del luogo di osservazione.
zinco: (s.m.) (Varie) zinc,zinc plate Metallo dotato di particolari caratteristiche elettrolitichecol quale si realizzano piastre ed elementi (chiamati anodi) che vengonoutilizzati per evitare la corrosione di elementi metallici immersi adopera delle correnti galvaniche.Tali elementi vengono posizionati nellaparte immersa dello scafo, in corrispondenza di parti metalliche, in particolaredell'elica e della zavorra, e nel tempo si consumano evitando la corrosionedelle parti metalliche contigue. Per questo vengono anche chiamati zinchisacrificali.
Explanation:
So I used BeautifulSoup to parse all the terms and their meanings. To do that, I had to parse all the text inside the <font color="#000066"> elements in the HTML code.
The words list contains the data in the following format:
[<font color="#000066">A</font>, <font color="#000066"> </font>, <font color="#000066"><b></b></font>, <font color="#000066"></font>, <font color="#000066">abbandonare</font>, <font color="#000066"> (v.) (Emergenze) to abandon L'atto
di lasciare la imbarcazione pericolante da parte del capitano e dell'equipaggio,
dopo aver esaurito tutti i tentativi suggeriti dall'arte nautica. </font>, <font color="#000066"><b></b></font>, <font color="#000066"></font>, <font color="#000066">abbisciare</font>, <font color="#000066"> (v.) (Cavi e nodi) to range, to
jag Preparare su uno spazio piano un cavo od una catena, ad ampie spire, ETC.]
The condition if len(word.text) > 1: is used to ignore the single-letter section headings that also get parsed (e.g. ['A', 'B', ..., 'Z']); that way we only deal with the terms and the meanings.
The next condition, if num % 2 == 0:, moves all the terms from the words list into a list called term, which is meant to contain only terms.
The last condition, if num % 2 == 1:, moves all the explanations from the words list into a list called explanation, which is meant to contain only explanations.
The last for loop is for printing purposes only.
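If the strict term/explanation alternation holds for the whole page, the same pairing can also be written more compactly with zip; a small sketch of that variant (the glossary name is mine, not from the answer above):
import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get("http://www.nodimarinari.it/Protopage10.html").text, 'html.parser')
words = soup.find_all("font", color="#000066")
# keep only real entries; even positions are terms, odd positions are explanations
entries = [w.text.replace("\n", " ").strip() for w in words if len(w.text) > 1]
glossary = dict(zip(entries[0::2], entries[1::2]))
print(glossary.get("abbandonare"))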

Python 3.5 imaplib: read Gmail as plain text

import imaplib

mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)
username = 'MyGmail@gmail.com'
password = 'MyPasswordHere'
mail.login(username, password)
mail.select('INBOX')
typ, data = mail.search(None, 'ALL')
for num in data[0].split():
    typ, data = mail.fetch(num, '(RFC822)')
    print(data)
    exit()
This is part of the whole output (not very readable):
[(b'1 (BODY[1] {1115}', b'\r\nHej Bjango For at sikre, at Bjangos side hj=C3=A6lper dig med at n=C3=A5 =\r\ndine m=C3=A5l, giver vi dig her nogle hurtige og nemme forslag til, hvad =\r\ndu kan g=C3=B8re: Opdater dit profilbillede og dit coverbillede =\r\nOverf=C3=B8r=C2=A0billede Tilf=C3=B8j en beskrivelse af din side =\r\nTilf=C3=B8j=C2=A0en=C2=A0beskrivelse Medtag et link til dit website =\r\nTilf=C3=B8j=C2=A0et=C2=A0link Sl=C3=A5 en opdatering eller et billede op =\r\np=C3=A5 din side Opret=C2=A0et=C2=A0opslag Inviter dine venner til at =\r\nsynes godt om din side Inviter=C2=A0dine=C2=A0venner\r\n\r\nHilsen Facebook-teamet\r\n\r\n\r\n\r\n=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=\r\n=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\r\nDenne besked blev sendt til infobjango#gmail.com. Hvis du ikke =C3=B8nsker =\r\nat modtage disse e-mails fra Facebook fremover, skal du f=C3=B8lge =\r\nnedenst=C3=A5ende link for at afmelde dem.\r\nhttps://www.facebook.com/o.php?k=3DAS38jnuCT_H5AdZt&u=3D100015358233656&mi=\r\nd=3D548339b6125b3G5af6a3e64c38G0G37b\r\nFacebook, Inc., Attention: Community Support, 1 Hacker Way, Menlo Park, CA =\r\n94025\r\n\r\n'), b')']
So here is my question:
How do I make the main part of the mail I received readable?
An example of what I mean by readable:
Dear bjango
This is a mail, which is totally readable without any "<<td style=3D"font-size: 16px; =\r\npadding-bottom: 26px; text-ali>". That's funny, I just made part of this mail unreadable when it was supposed to be readable - I hope you get the point.
Best Regards Bjango
This approach does not completely solve my issue with readability:
import email
msg = email.message_from_bytes(data[0][1])
print(msg.get_payload(decode=False))
Output will be as followed:
Hej Bjango For at sikre, at Bjangos side hj=C3=A6lper dig med at n=C3=A5 =
dine m=C3=A5l, giver vi dig her nogle hurtige og nemme forslag til, hvad =
du kan g=C3=B8re: Opdater dit profilbillede og dit coverbillede =
Overf=C3=B8r=C2=A0billede Tilf=C3=B8j en beskrivelse af din side =
Tilf=C3=B8j=C2=A0en=C2=A0beskrivelse Medtag et link til dit website =
Tilf=C3=B8j=C2=A0et=C2=A0link Sl=C3=A5 en opdatering eller et billede op =
p=C3=A5 din side Opret=C2=A0et=C2=A0opslag Inviter dine venner til at =
synes godt om din side Inviter=C2=A0dine=C2=A0venner
Hilsen Facebook-teamet
But nonetheless, it's a massive improvement.
This is the email I intend to get in the output section:
Hej Bjango For at sikre, at Bjangos side hjælper dig med at nå
dine mål, giver vi dig her nogle hurtige og nemme forslag til, hvad
du kan gøre: Opdater dit profilbillede og dit coverbillede
Overfører billeder Tilføj en beskrivelse af din side
Tilføj en beskrivelse Medtag et link til dit website
Tilføj et link Slå en opdatering eller et billede op
på din side Opret et opslag Inviter dine venner til at
synes godt om din side Inviter dine venner
Hilsen Facebook-teamet
Python 3's email parser should work here:
import email
msg = email.message_from_bytes(data[0][1])
payload = msg.get_payload(decode=False)  # keep the raw quoted-printable text
Assuming your email is encoded as MIME quoted-printable data, you can then decode it using the quopri module:
import quopri
message = quopri.decodestring(payload).decode('utf-8')
print(message)
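For a message fetched with RFC822, the email package can also deal with multipart messages and the charset for you; a rough end-to-end sketch (untested, reusing the placeholder credentials from the question):
import imaplib
import email

mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)
mail.login('MyGmail@gmail.com', 'MyPasswordHere')
mail.select('INBOX')
typ, data = mail.search(None, 'ALL')
for num in data[0].split():
    typ, msg_data = mail.fetch(num, '(RFC822)')
    msg = email.message_from_bytes(msg_data[0][1])
    # walk() visits every MIME part; keep only the plain-text ones
    for part in msg.walk():
        if part.get_content_type() == 'text/plain':
            charset = part.get_content_charset() or 'utf-8'
            # decode=True undoes the Content-Transfer-Encoding (quoted-printable, base64, ...)
            print(part.get_payload(decode=True).decode(charset, errors='replace'))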
