WordNet Find Synonyms - nlp

I am searching for a way to find all the synonyms of a particular word using WordNet. I am using JAWS.
For example:
love(v): admire, adulate, be attached to, be captivated by, be crazy about, be enamored of, be enchanted by, be fascinated with, be fond of, be in love with, canonize, care for, cherish, choose, deify, delight in, dote on, esteem, exalt, fall for, fancy, glorify, go for, gone on....
love(n):
Synonym : adulation, affection, allegiance, amity, amorousness, amour, appreciation, ardency, ardor, attachment, case*, cherishing, crush, delight, devotedness, devotion, emotion, enchantment, enjoyment, fervor, fidelity, flame, fondness, friendship, hankering, idolatry, inclination, infatuation, involvement
In a related question, user Ram pointed to some code, but that does not suffice because it gives a vastly different output:
love, passion: any object of warm affection or devotion beloved, dear, dearest, honey,
love: a beloved person; used as terms of endearment
love, sexual love, erotic love: a deep feeling of sexual desire and attraction
love: a score of zero in tennis or squash
sexual love, lovemaking, making love, love, love life: sexual activities (often including sexual intercourse) between two people
love: have a great affection or liking for
So how do I achieve this, and is WordNet suited for what I want to do?

Sticking with just WordNet, you could try to use semantic similarity to determine if two words (synsets) are similar enough to be synonyms. Below is a quick example that came from modifying another of my answers on semantic similarity using WordNet.
It does have its problems though:
Antonyms are mixed in with synonyms
It is slow! (as it has to check all ~117k synsets)
Still, it produces more synonyms than using lemma_names alone, so I leave it here in case it might be useful (in conjunction with something else perhaps).
>>> from nltk.corpus import wordnet as wn
>>> def syn(word, lch_threshold=2.26):
        for net1 in wn.synsets(word):
            for net2 in wn.all_synsets():
                try:
                    lch = net1.lch_similarity(net2)
                except:
                    continue
                # The value to compare the LCH to was found empirically.
                # (The value is very application dependent. Experiment!)
                if lch >= lch_threshold:
                    yield (net1, net2, lch)
>>> for x in syn('love'):
        print x
The code above outputs:
(Synset('love.n.01'), Synset('feeling.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('conditioned_emotional_response.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('worship.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('anger.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.03'), 2.538973871058276)
(Synset('love.n.01'), Synset('anxiety.n.02'), 2.538973871058276)
(Synset('love.n.01'), Synset('joy.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('love.n.01'), 3.6375861597263857)
(Synset('love.n.01'), Synset('agape.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('agape.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('filial_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('ardor.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('amorousness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('puppy_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('devotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('benevolence.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('beneficence.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('heartstrings.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('lovingness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('warmheartedness.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('loyalty.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('hate.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotional_state.n.01'), 2.538973871058276)
(Synset('love.n.02'), Synset('content.n.05'), 2.538973871058276)
(Synset('love.n.02'), Synset('object.n.04'), 2.9444389791664407)
(Synset('love.n.02'), Synset('antipathy.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('bugbear.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('execration.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('center.n.06'), 2.538973871058276)
(Synset('love.n.02'), Synset('hallucination.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('infatuation.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('love.n.02'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('person.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('lover.n.01'), 2.9444389791664407)
(Synset('beloved.n.01'), Synset('admirer.n.03'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('beloved.n.01'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('betrothed.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('boyfriend.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('darling.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('girlfriend.n.02'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('idolizer.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorata.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorato.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('kisser.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('necker.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('petter.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('romeo.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('soul_mate.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('squeeze.n.04'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('sweetheart.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('desire.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('sexual_desire.n.01'), 2.9444389791664407)
(Synset('love.n.04'), Synset('love.n.04'), 3.6375861597263857)
(Synset('love.n.04'), Synset('aphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('anaphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('passion.n.05'), 2.538973871058276)
(Synset('love.n.04'), Synset('sensuality.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('amorousness.n.02'), 2.538973871058276)
(Synset('love.n.04'), Synset('fetish.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('libido.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('lecherousness.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('nymphomania.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('satyriasis.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('the_hots.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bowling_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('football_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('baseball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('basketball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('number.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('score.n.03'), 2.9444389791664407)
(Synset('love.n.05'), Synset('stroke.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('birdie.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bogey.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('deficit.n.03'), 2.538973871058276)
(Synset('love.n.05'), Synset('double-bogey.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('duck.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('eagle.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('double_eagle.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('game.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('lead.n.07'), 2.538973871058276)
(Synset('love.n.05'), Synset('love.n.05'), 3.6375861597263857)
(Synset('love.n.05'), Synset('match.n.05'), 2.538973871058276)
(Synset('love.n.05'), Synset('par.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bondage.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('outercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('safe_sex.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_activity.n.01'), 2.9444389791664407)
(Synset('sexual_love.n.02'), Synset('conception.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_intercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('pleasure.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_love.n.02'), 3.6375861597263857)
(Synset('sexual_love.n.02'), Synset('carnal_abuse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('coupling.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('reproduction.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('foreplay.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('perversion.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('autoeroticism.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('promiscuity.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('lechery.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('homosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bisexuality.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('heterosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bestiality.n.02'), 2.538973871058276)
# ...
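If what you ultimately want is a flat list of candidate words rather than synset pairs, one possible post-processing step (my sketch, not part of the answer above; it assumes the syn() generator defined above and a Python 3 / NLTK 3 setup where lemma_names() is a method) is to collect the lemma names of every synset the generator yields:
from nltk.corpus import wordnet as wn

def synonym_candidates(word, lch_threshold=2.26):
    # Flatten the (synset, synset, score) tuples from syn() into unique lemma names.
    candidates = set()
    for net1, net2, lch in syn(word, lch_threshold):
        for lemma in net2.lemma_names():
            if lemma.lower() != word.lower():
                candidates.add(lemma.replace('_', ' '))
    return sorted(candidates)

print(synonym_candidates('love'))
Antonyms such as hate will still slip through, so this is only a starting point for further filtering.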

First we have to ask the questions: "What is a synonym?" and "Can synonyms be queried from the surface/root word?"
In WordNet, similar words representing the same concept are grouped under a term called a synset, not at the surface-word level.
To get synonyms with the coverage of your example, you would need more than WordNet, possibly some semantic-similarity methods to extract the other words.
I can't show you what I mean above in JAWS, but here it is with the WordNet interface in NLTK for Python. You can see that WordNet alone is insufficient for the coverage you want.
from nltk.corpus import wordnet as wn

for ss in wn.synsets('love'):  # Each synset represents a different concept.
    print ss.definition
    print ss.lemma_names
    print
The code above outputs:
a strong positive emotion of regard and affection
['love']
any object of warm affection or devotion;
['love', 'passion']
a beloved person; used as terms of endearment
['beloved', 'dear', 'dearest', 'honey', 'love']
a deep feeling of sexual desire and attraction
['love', 'sexual_love', 'erotic_love']
a score of zero in tennis or squash
['love']
sexual activities (often including sexual intercourse) between two people
['sexual_love', 'lovemaking', 'making_love', 'love', 'love_life']
have a great affection or liking for
['love']
get pleasure from
['love', 'enjoy']
be enamored or in love with
['love']
have sexual intercourse with
['sleep_together', 'roll_in_the_hay', 'love', 'make_out', 'make_love', 'sleep_with', 'get_laid', 'have_sex', 'know', 'do_it', 'be_intimate', 'have_intercourse', 'have_it_away', 'have_it_off', 'screw', 'fuck', 'jazz', 'eff', 'hump', 'lie_with', 'bed', 'have_a_go_at_it', 'bang', 'get_it_on', 'bonk']
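If the union of those lemma names is enough for your purposes, a minimal sketch (my addition; it assumes NLTK 3, where lemma_names() is a method, whereas in older NLTK it is a plain attribute) would be:
from nltk.corpus import wordnet as wn

def wordnet_lemmas(word, pos=None):
    # Union of lemma names across every synset of the word, minus the word itself.
    lemmas = set()
    for ss in wn.synsets(word, pos=pos):
        for name in ss.lemma_names():
            if name.lower() != word.lower():
                lemmas.add(name.replace('_', ' '))
    return sorted(lemmas)

print(wordnet_lemmas('love'))       # all parts of speech
print(wordnet_lemmas('love', 'v'))  # verbs only
As the output above shows, this gives far fewer words than the thesaurus-style list in the question, which is exactly the coverage problem being described.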

I have looked at the RiWordNet (RiTa) code (see the getAllSynonyms method) and found that it generates synonyms by giving you all lemmas of the synsets, hyponyms, similar-tos, also-sees, and coordinate terms. I didn't include coordinate terms but added antonyms. Furthermore, I added synset names and a "synonym type" tag to my data so that I could make use of other WordNet data like the definition and/or examples. Here's my code in Python, followed by the output:
'''Synonym generator using the NLTK WordNet interface: http://www.nltk.org/howto/wordnet.html
'ss': synset
'hyp': hyponym
'sim': similar to
'ant': antonym
'also': also see
'''
from nltk.corpus import wordnet as wn

def get_all_synsets(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for lemma in ss.lemma_names():
            yield (lemma, ss.name())

def get_all_hyponyms(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for hyp in ss.hyponyms():
            for lemma in hyp.lemma_names():
                yield (lemma, hyp.name())

def get_all_similar_tos(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for sim in ss.similar_tos():
            for lemma in sim.lemma_names():
                yield (lemma, sim.name())

def get_all_antonyms(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for sslemma in ss.lemmas():
            for antlemma in sslemma.antonyms():
                yield (antlemma.name(), antlemma.synset().name())

def get_all_also_sees(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for also in ss.also_sees():
            for lemma in also.lemma_names():
                yield (lemma, also.name())

def get_all_synonyms(word, pos=None):
    for x in get_all_synsets(word, pos):
        yield (x[0], x[1], 'ss')
    for x in get_all_hyponyms(word, pos):
        yield (x[0], x[1], 'hyp')
    for x in get_all_similar_tos(word, pos):
        yield (x[0], x[1], 'sim')
    for x in get_all_antonyms(word, pos):
        yield (x[0], x[1], 'ant')
    for x in get_all_also_sees(word, pos):
        yield (x[0], x[1], 'also')

for x in get_all_synonyms('love'):
    print x
The output for 'love' and 'brave':
Love
(u'love', u'love.n.01', 'ss')
(u'love', u'love.n.02', 'ss')
(u'passion', u'love.n.02', 'ss')
(u'beloved', u'beloved.n.01', 'ss')
(u'dear', u'beloved.n.01', 'ss')
(u'dearest', u'beloved.n.01', 'ss')
(u'honey', u'beloved.n.01', 'ss')
(u'love', u'beloved.n.01', 'ss')
(u'love', u'love.n.04', 'ss')
(u'sexual_love', u'love.n.04', 'ss')
(u'erotic_love', u'love.n.04', 'ss')
(u'love', u'love.n.05', 'ss')
(u'sexual_love', u'sexual_love.n.02', 'ss')
(u'lovemaking', u'sexual_love.n.02', 'ss')
(u'making_love', u'sexual_love.n.02', 'ss')
(u'love', u'sexual_love.n.02', 'ss')
(u'love_life', u'sexual_love.n.02', 'ss')
(u'love', u'love.v.01', 'ss')
(u'love', u'love.v.02', 'ss')
(u'enjoy', u'love.v.02', 'ss')
(u'love', u'love.v.03', 'ss')
(u'sleep_together', u'sleep_together.v.01', 'ss')
(u'roll_in_the_hay', u'sleep_together.v.01', 'ss')
(u'love', u'sleep_together.v.01', 'ss')
(u'make_out', u'sleep_together.v.01', 'ss')
(u'make_love', u'sleep_together.v.01', 'ss')
(u'sleep_with', u'sleep_together.v.01', 'ss')
(u'get_laid', u'sleep_together.v.01', 'ss')
(u'have_sex', u'sleep_together.v.01', 'ss')
(u'know', u'sleep_together.v.01', 'ss')
(u'do_it', u'sleep_together.v.01', 'ss')
(u'be_intimate', u'sleep_together.v.01', 'ss')
(u'have_intercourse', u'sleep_together.v.01', 'ss')
(u'have_it_away', u'sleep_together.v.01', 'ss')
(u'have_it_off', u'sleep_together.v.01', 'ss')
(u'screw', u'sleep_together.v.01', 'ss')
(u'fuck', u'sleep_together.v.01', 'ss')
(u'jazz', u'sleep_together.v.01', 'ss')
(u'eff', u'sleep_together.v.01', 'ss')
(u'hump', u'sleep_together.v.01', 'ss')
(u'lie_with', u'sleep_together.v.01', 'ss')
(u'bed', u'sleep_together.v.01', 'ss')
(u'have_a_go_at_it', u'sleep_together.v.01', 'ss')
(u'bang', u'sleep_together.v.01', 'ss')
(u'get_it_on', u'sleep_together.v.01', 'ss')
(u'bonk', u'sleep_together.v.01', 'ss')
(u'agape', u'agape.n.01', 'hyp')
(u'agape', u'agape.n.02', 'hyp')
(u'agape_love', u'agape.n.02', 'hyp')
(u'amorousness', u'amorousness.n.01', 'hyp')
(u'enamoredness', u'amorousness.n.01', 'hyp')
(u'ardor', u'ardor.n.02', 'hyp')
(u'ardour', u'ardor.n.02', 'hyp')
(u'benevolence', u'benevolence.n.01', 'hyp')
(u'devotion', u'devotion.n.01', 'hyp')
(u'devotedness', u'devotion.n.01', 'hyp')
(u'filial_love', u'filial_love.n.01', 'hyp')
(u'heartstrings', u'heartstrings.n.01', 'hyp')
(u'lovingness', u'lovingness.n.01', 'hyp')
(u'caring', u'lovingness.n.01', 'hyp')
(u'loyalty', u'loyalty.n.02', 'hyp')
(u'puppy_love', u'puppy_love.n.01', 'hyp')
(u'calf_love', u'puppy_love.n.01', 'hyp')
(u'crush', u'puppy_love.n.01', 'hyp')
(u'infatuation', u'puppy_love.n.01', 'hyp')
(u'worship', u'worship.n.02', 'hyp')
(u'adoration', u'worship.n.02', 'hyp')
(u'adore', u'adore.v.01', 'hyp')
(u'care_for', u'care_for.v.02', 'hyp')
(u'cherish', u'care_for.v.02', 'hyp')
(u'hold_dear', u'care_for.v.02', 'hyp')
(u'treasure', u'care_for.v.02', 'hyp')
(u'dote', u'dote.v.02', 'hyp')
(u'love', u'love.v.03', 'hyp')
(u'get_off', u'get_off.v.06', 'hyp')
(u'romance', u'romance.v.02', 'hyp')
(u'fornicate', u'fornicate.v.01', 'hyp')
(u'take', u'take.v.35', 'hyp')
(u'have', u'take.v.35', 'hyp')
(u'hate', u'hate.n.01', 'ant')
(u'hate', u'hate.v.01', 'ant')
Brave
(u'brave', u'brave.n.01', 'ss')
(u'brave', u'brave.n.02', 'ss')
(u'weather', u'weather.v.01', 'ss')
(u'endure', u'weather.v.01', 'ss')
(u'brave', u'weather.v.01', 'ss')
(u'brave_out', u'weather.v.01', 'ss')
(u'brave', u'brave.a.01', 'ss')
(u'courageous', u'brave.a.01', 'ss')
(u'audacious', u'audacious.s.01', 'ss')
(u'brave', u'audacious.s.01', 'ss')
(u'dauntless', u'audacious.s.01', 'ss')
(u'fearless', u'audacious.s.01', 'ss')
(u'hardy', u'audacious.s.01', 'ss')
(u'intrepid', u'audacious.s.01', 'ss')
(u'unfearing', u'audacious.s.01', 'ss')
(u'brave', u'brave.s.03', 'ss')
(u'braw', u'brave.s.03', 'ss')
(u'gay', u'brave.s.03', 'ss')
(u'desperate', u'desperate.s.04', 'sim')
(u'heroic', u'desperate.s.04', 'sim')
(u'gallant', u'gallant.s.01', 'sim')
(u'game', u'game.s.02', 'sim')
(u'gamy', u'game.s.02', 'sim')
(u'gamey', u'game.s.02', 'sim')
(u'gritty', u'game.s.02', 'sim')
(u'mettlesome', u'game.s.02', 'sim')
(u'spirited', u'game.s.02', 'sim')
(u'spunky', u'game.s.02', 'sim')
(u'lionhearted', u'lionhearted.s.01', 'sim')
(u'stalwart', u'stalwart.s.03', 'sim')
(u'stouthearted', u'stalwart.s.03', 'sim')
(u'undaunted', u'undaunted.s.02', 'sim')
(u'valiant', u'valiant.s.01', 'sim')
(u'valorous', u'valiant.s.01', 'sim')
(u'bold', u'bold.a.01', 'sim')
(u'colorful', u'colorful.a.02', 'sim')
(u'colourful', u'colorful.a.02', 'sim')
(u'timid', u'timid.n.01', 'ant')
(u'cowardly', u'cowardly.a.01', 'ant')
(u'adventurous', u'adventurous.a.01', 'also')
(u'adventuresome', u'adventurous.a.01', 'also')
(u'bold', u'bold.a.01', 'also')
(u'resolute', u'resolute.a.01', 'also')
(u'unafraid', u'unafraid.a.01', 'also')
(u'fearless', u'unafraid.a.01', 'also')
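If you want to drop the antonyms from the final list and attach the synset definitions mentioned above, a small wrapper around get_all_synonyms could look like this (my addition, not part of the code above; it assumes Python 3 / NLTK 3 and the functions and wn import defined earlier):
from collections import defaultdict

def synonyms_with_definitions(word, pos=None):
    # Group lemmas by relation type and keep each synset's definition for context.
    grouped = defaultdict(list)
    for lemma, ss_name, rel in get_all_synonyms(word, pos):
        if rel == 'ant':  # skip antonyms
            continue
        grouped[rel].append((lemma, wn.synset(ss_name).definition()))
    return grouped

for rel, items in synonyms_with_definitions('brave').items():
    print(rel, items[:3])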

How about using RiWordNet (RiTa) instead of JAWS? I see it is listed as one of the available APIs at http://wordnet.princeton.edu/wordnet/related-projects/#Java
I also see it has a getAllSynonyms() method; they provide an example and it works.
Check this out:
import java.io.IOException;
import java.util.Arrays;

import rita.*;

public class Synonyms {
    public static void main(String[] args) throws IOException {
        // Would pass in a PApplet normally, but we don't need to here
        RiWordNet wordnet = new RiWordNet("C:\\Program Files (x86)\\WordNet\\2.1");
        String word = "love"; // or wordnet.getRandomWord("n");
        // Get all verb synonyms
        String[] synonyms = wordnet.getAllSynonyms(word, "v");
        System.out.println("Word: " + word);
        if (synonyms != null) {
            // Sort alphabetically
            Arrays.sort(synonyms);
            for (int i = 0; i < synonyms.length; i++) {
                System.out.println("Synonym " + i + ": " + synonyms[i]);
            }
        } else {
            System.out.println("No synonyms!");
        }
    }
}
Make sure to get the latest downloads and documentation from http://www.rednoise.org/rita/wordnet-old/documentation/index.htm

Related

I cannot extract the text from an element using ElementTree

A snippet of my document and the code are as follows:
import xml.etree.ElementTree as ET
obj = ET.fromstring("""
<tab>
<infos><bounds left="7947" top="88607" width="10086" height="1184" bottom="89790" right="18032" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="115" top="0" width="9300" height="1169" bottom="1168" right="9414"/> </infos>
<row > <infos> <bounds left="8062" top="88607" width="9300" height="524" bottom="89130" right="17361" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="0" width="9300" height="524" bottom="523" right="9299"/> </infos>
<cell ptr="000002232E644270" id="199" symbol="class SwCellFrame" next="202" upper="198" lower="200" rowspan="1"> <infos> <bounds left="8062" top="88607" width="546" height="524" bottom="89130" right="8607" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="7" top="15" width="532" height="509" bottom="523" right="538"/> </infos>
<txt> <infos> <bounds left="8069" top="88622" width="532" height="187" bottom="88808" right="8600" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="3" width="532" height="184" bottom="186" right="531"/> </infos>
<Finish/>
</txt>
<txt> <infos> <bounds left="8069" top="88809" width="532" height="149" bottom="88957" right="8600" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="136" top="0" width="396" height="149" bottom="148" right="531"/> </infos>
UDA <Finish/>
</txt>
</cell>
<cell ptr="000002232E642E40" id="202" symbol="class SwCellFrame" next="205" prev="199" upper="198" lower="203" rowspan="1"> <infos> <bounds left="8608" top="88607" width="3283" height="524" bottom="89130" right="11890" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="7" top="15" width="3269" height="509" bottom="523" right="3275"/> </infos>
<txt>
<infos> <bounds left="8615" top="88622" width="3269" height="180" bottom="88801" right="11883" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="7" width="3269" height="173" bottom="179" right="3268"/> </infos> <Finish/>
</txt>
<txt> <infos> <bounds left="8615" top="88802" width="3269" height="149" bottom="88950" right="11883" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="58" top="0" width="3170" height="149" bottom="148" right="3227"/> </infos>
Nombre <Finish/>
</txt>
</cell>
</row>
</tab>
""")
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    print(i, item.text.strip())
But if I simplify the document, I do manage to extract the text,
obj = ET.fromstring("""
<tab>
<row>
<cell >
<txt > <Finish/> </txt>
<txt > UDA <Finish/> </txt>
</cell>
<cell >
<txt > <Finish/> </txt>
<txt > Nombre <Finish/> </txt>
</cell>
</row>
</tab>
""")
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    print(i, item.text.strip())
0
1 UDA
2
3 Nombre
I don't know how to solve this problem, because my working document is very large and I can't simplify it as I have done in this example.
The "UDA" and "Nombre" strings are found in the tail of infos elements. The easiest way to get the wanted output is to use itertext():
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    text = "".join([s.strip() for s in item.itertext()])
    print(i, text)
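An alternative, if you prefer to point at exactly where the text lives, is to read the tail of each infos child directly (a sketch against the same document structure shown above):
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    infos = item.find('infos')
    # The visible text comes after the closing </infos> tag, so ElementTree stores it in .tail
    print(i, (infos.tail or "").strip())
Both versions should print the same four rows as the simplified example (empty, UDA, empty, Nombre).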

Extract tag content from XML using beautifulsoup

Thanks to the many threads I found here, I managed to do some of what I wanted to do. But now I'm stuck. Help would be appreciated.
So I have this XML file of a few thousand records, from which I want to extract
The contents of tag 520 (URL)
The contents of tag 001 (recno) wherever a tag 520 was found
--> So the result should be a list of URLs + recnos.
Bonus points for helping me export the subsequent result to a csv instead of showing it onscreen ;)
# Import BeautifulSoup
from bs4 import BeautifulSoup as bs

content = []
# Read the XML file
with open("snippet_bilzen.xml", "r") as file:
    # Read each line in the file, readlines() returns a list of lines
    content = file.readlines()
    # Combine the lines in the list into a string
    content = "".join(content)

bs_content = bs(content, "lxml")

# Get contents of tag 520
rows_url = bs_content.find_all(tag="520")
for row in rows_url:  # Print all occurrences
    print(row.get_text())

# Trying to get contents of tag 001 where 520 occurs
rows_id = bs_content.find_all(tag="001")
for row in rows_id:
    print(row.get_text())
This is a piece of the XML:
<record>
<leader>00983nam a2200000 c 4500</leader>
<controlfield tag="001">c:obg:160033</controlfield>
<controlfield tag="005">20180605143926.1</controlfield>
<controlfield tag="008">060214s1987 xx u und </controlfield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">0075992557726</subfield>
</datafield>
<datafield ind1="1" ind2="0" tag="245">
<subfield code="a">Sign 'O' the times</subfield>
</datafield>
<datafield ind1="#" ind2="#" tag="260">
<subfield code="b">Paisley Park</subfield>
<subfield code="c">1987</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="300">
<subfield code="a">2 cd's</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="306">
<subfield code="a">01:19:51</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="340">
<subfield code="a">cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="500">
<subfield code="a">Met teksten</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="520">
<subfield code="a">ill</subfield>
<subfield code="u">http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208</subfield>
</datafield>
</record>
<record>
<leader>00854nam a2200000 c 4500</leader>
<controlfield tag="001">c:obg:157417</controlfield>
<controlfield tag="005">20180725100810.1</controlfield>
<controlfield tag="008">060214s1984 xx u und </controlfield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">0042282289827</subfield>
</datafield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">4007196101944</subfield>
</datafield>
<datafield ind1="2" ind2=" " tag="024">
<subfield code="a">JKX0823</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="028">
<subfield code="a">IMCD 236/822 898-2</subfield>
</datafield>
<datafield ind1="1" ind2="3" tag="245">
<subfield code="a">The unforgettable fire</subfield>
</datafield>
<datafield ind1="#" ind2="#" tag="260">
<subfield code="b">Island Records</subfield>
<subfield code="c">1984</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="300">
<subfield code="a">1 cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="306">
<subfield code="a">00:42:48</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="340">
<subfield code="a">cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="520">
<subfield code="a">ill</subfield>
<subfield code="u">http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742</subfield>
</datafield>
</record>
Try this.
from simplified_scrapy import SimplifiedDoc,req,utils
html = utils.getFileContent('snippet_bilzen.xml')
doc = SimplifiedDoc(html)
rows_url = doc.selects('#tag=520').select('#code=u').text
rows_id = doc.selects('#tag=001').text
print (rows_url)
print (rows_id)
Result:
['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208', 'http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742']
['c:obg:160033', 'c:obg:157417']
If I understand you right, you want to get data only from records where elements with tag="520" and tag="001" are present:
from bs4 import BeautifulSoup

with open('snippet_bilzen.xml', 'r') as f_in:
    soup = BeautifulSoup(f_in.read(), 'html.parser')

data = []
for record in soup.select('record:has([tag="520"] > [code="u"]):has([tag="001"])'):
    tag_520 = record.select_one('[tag="520"] > [code="u"]')  # select URL
    tag_001 = record.select_one('[tag="001"]')                # select record number
    data.append([tag_520.get_text(strip=True), tag_001.get_text(strip=True)])

print(data)
Prints:
[['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208', 'c:obg:160033'],
['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742', 'c:obg:157417']]
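For the CSV bonus, the data list built in the BeautifulSoup answer above can be written out with the standard csv module (a minimal sketch; the output filename is just an example):
import csv

# 'data' is the list of [url, recno] pairs collected above
with open('urls_recnos.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    writer.writerow(['url', 'recno'])  # header row
    writer.writerows(data)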

Python - pandas datetime column with multiple timezones

I have a data frame with multiple users and timezones, like such:
import pandas as pd

cols = ['user', 'zone_name', 'utc_datetime']
data = [
    [1, 'Europe/Amsterdam', pd.to_datetime('2019-11-13 11:14:15')],
    [2, 'Europe/London', pd.to_datetime('2019-11-13 11:14:15')],
]
df = pd.DataFrame(data, columns=cols)
Based on this other post, I apply the following change to get the user local datetime:
df['local_datetime'] = df.groupby('zone_name')[
'utc_datetime'
].transform(lambda x: x.dt.tz_localize(x.name))
Which outputs this:
user zone_name utc_datetime local_datetime
1 Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 11:14:15+01:00
2 Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15+00:00
However, the local_datetime column is an object and I cannot find a way to get it as datetime64[ns] and in the following format (desired output):
user zone_name utc_datetime local_datetime
1 Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15
2 Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15
I think you need Series.dt.tz_convert in the lambda function:
df['local_datetime'] = (pd.to_datetime(df.groupby('zone_name')['utc_datetime']
.transform(lambda x: x.dt.tz_localize('UTC').dt.tz_convert(x.name))
.astype(str).str[:-6]))
print(df)
user zone_name utc_datetime local_datetime
0 1 Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15
1 2 Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15
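A variant of the same groupby idea that stays in datetime64[ns] without going through strings (my sketch, not part of the answer above) is to strip the timezone with tz_localize(None) at the end:
df['local_datetime'] = df.groupby('zone_name')['utc_datetime'].transform(
    lambda x: x.dt.tz_localize('UTC').dt.tz_convert(x.name).dt.tz_localize(None))

print(df.dtypes)  # local_datetime should now be datetime64[ns]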
Relatively shorter answer using DataFrame.apply:
df['local_datetime'] = df.apply(lambda x: x.utc_datetime.tz_localize(tz = "UTC").tz_convert(x.zone_name), axis = 1)
print(df)
user zone_name utc_datetime local_datetime
0 1 Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15+01:00
1 2 Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15+00:00
If you want to remove the time zone information, you can localize times by passing None
df['local_datetime'] = df.apply(lambda x: x.utc_datetime.tz_localize(tz = "UTC").tz_convert(x.zone_name).tz_localize(None), axis = 1)
print(df)
user zone_name utc_datetime local_datetime
0 1 Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15
1 2 Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15

How to use tabulate in python3 dictionary while using loops

I'm trying to print dictionary data in tabular form. For now I see the tabulate module as an easy way to test; the data itself comes out fine, but the header information is repeated on each run for each user ID. Please guide or suggest how to fix that.
$ cat checktable.py
#!/usr/bin/python3
import subprocess
import pandas as pd
from tabulate import tabulate

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h ldapserver -D 'cn=directory manager' -w pass123 -LLLb 'ou=people,o=rraka.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    info_str = proc.stdout.read().decode('utf8')
    split_str = info_str.split()
    if len(split_str) > 1:
        raw_data = {'UserID': [split_str[1].split(',')[0].split('=')[1]], 'MangerID': [split_str[-1]]}
        headers = ["UserID", "MangerID"]
        return tabulate(raw_data, headers, tablefmt="simple")
    else:
        split_str = 'null'

def CallUid():
    with open('hh', mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            print(CheckUid(line))

if __name__ == '__main__':
    CallUid()
This returns the below data:
$ ./checktable.py
UserID MangerID
-------- ----------
aashishp rpudota
UserID MangerID
-------- ----------
abaillie davem
UserID MangerID
-------- ----------
abishek kalyang
UserID MangerID
Expected output:
$ ./checktable.py
UserID MangerID
-------- ----------
aashishp rpudota
abaillie davem
abishek kalyang
An alternative version of the code:
#!/usr/bin/python3
import sys
import subprocess
from tabulate import tabulate

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h its3 -D 'cn=directory manager' -w JatetRE3 -LLLb 'ou=people,o=cadence.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    info_str = proc.stdout.read().decode('utf8')
    split_str = info_str.split()
    if len(split_str) > 1:
        raw_data = {'UserID': split_str[1].split(',')[0].split('=')[1], 'Manger': split_str[-1]}
        for key, value in raw_data.items():
            print(key, ":", value)
    else:
        split_str = 'null'

def CallUid():
    with open('hh', mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            CheckUid(line)

if __name__ == '__main__':
    CallUid()
It outputs the following, where I need every two lines merged into one:
$ ./checktable2.py
UserID : aashishp
Manger : rpudota
UserID : abaillie
Manger : davem
While desired would be:
$ ./checktable2.py
UserID : aashishp Manger : rpudota
UserID : abaillie Manger : davem
After struggling as a learner, I came up with the following variants of the code as solutions to my own question:
1) The first version uses the pandas module:
$ cat check_ldapUserdata.py
#!/usr/bin/python3
import pandas as pd
import subprocess

user_list = []
mngr_list = []

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h ldapserver -D 'cn=directory manager' -w JatetRE3 -LLLb 'ou=people,o=rraka.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    info_str = proc.stdout.read().decode('utf8')
    split_str = info_str.split()
    if len(split_str) > 1:
        user = split_str[1].split(',')[0].split('=')[1]
        manager = split_str[-1]
        user_list.append(user)
        mngr_list.append(manager)
    else:
        split_str = 'null'

def DataList():
    df = pd.DataFrame({'User': user_list, 'Manager': mngr_list})
    df = df[['User', 'Manager']]  # To keep the order of columns
    #return df
    print(df)

def CallUid():
    with open('testu', mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            CheckUid(line)

if __name__ == '__main__':
    CallUid()
    DataList()
The resulting output is as follows:
$ ./check_ldapUserdata.py
User Manager
0 karn benjamin
1 niraj vikashg
2 vaithees benjamin
3 mauj benjamin
2) Another way I achieved it was with a regular expression and the BeautifulTable module to get the table format:
$ cat check_ldapUserdata2.py
#!/usr/bin/python3
import re
import subprocess
from beautifultable import BeautifulTable

table = BeautifulTable()
table.column_headers = ["User", "Manager"]

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h ldapserver -D 'cn=directory manager' -w pass123 -LLLb 'ou=people,o=rraka.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    info_str = proc.stdout.read().decode('utf8')
    pat_match = re.match(r".*uid=(.*?)\,.*\nmanagerlogin:\s+(.*)", info_str)
    if pat_match:
        table.append_row([pat_match.group(1), pat_match.group(2)])

def CallUid():
    input_file = input("Please enter the file name : ")
    with open(input_file, mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            CheckUid(line)
    print(table)

if __name__ == '__main__':
    CallUid()
The resulting output is as below:
$ ./check_ldapUserdata2.py
Please enter the file name : testu
+----------+----------+
| User | Manager |
+----------+----------+
| karn | benjamin |
+----------+----------+
| niraj | vikashg |
+----------+----------+
| vaithees | benjamin |
+----------+----------+
| mauj | benjamin |
+----------+----------+
3) A simpler, non-tabular form that also works:
$ cat check_table_working1.py
#!/usr/bin/python3
import subprocess

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h ldapserver -D 'cn=directory manager' -w pass123 -LLLb 'ou=people,o=rraka.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    info_str = proc.stdout.read().decode('utf8')
    split_str = info_str.split()
    if len(split_str) > 1:
        raw_data = {split_str[1].split(',')[0].split('=')[1]: split_str[-1]}
        #raw_data = {'UserID': split_str[1].split(',')[0].split('=')[1], 'Manger': split_str[-1]}
        for key, value in raw_data.items():
            #print(key, ":", value)
            print('{} : {}'.format(key, value))
    else:
        split_str = 'null'

def CallUid():
    with open('hh', mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            CheckUid(line)

if __name__ == '__main__':
    CallUid()
The output of the above is as follows:
$ ./check_table_working1.py
aashishp : rpudota
abaillie : davem
abishek : kalyang
adik : venky
adithya : jagi
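For reference, the original repeated-header problem can also be fixed with tabulate itself by collecting all rows first and calling tabulate once at the end. A sketch, keeping the same ldapsearch parsing assumptions as the scripts above:
#!/usr/bin/python3
import subprocess
from tabulate import tabulate

rows = []

def CheckUid(user):
    proc = subprocess.Popen("ldapsearch -h ldapserver -D 'cn=directory manager' -w pass123 -LLLb 'ou=people,o=rraka.com' 'uid=%s' managerlogin" % (user), shell=True, stdout=subprocess.PIPE)
    split_str = proc.stdout.read().decode('utf8').split()
    if len(split_str) > 1:
        # Append one [UserID, MangerID] row per user instead of printing a table per user
        rows.append([split_str[1].split(',')[0].split('=')[1], split_str[-1]])

def CallUid():
    with open('hh', mode='rt', encoding='utf-8') as f:
        for line in f.readlines():
            CheckUid(line)
    print(tabulate(rows, headers=["UserID", "MangerID"], tablefmt="simple"))

if __name__ == '__main__':
    CallUid()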

Rotate SVG without misplacing the image

I have this SVG code:
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="34">
<path transform="rotate(0 10 17)" fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path transform="rotate(0 10 17) translate(0 0)" fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>
It's a vessel with an arrow inside.
I am trying to rotate this picture (I need to be able to rotate it through a full 360 degrees).
When I change rotate(0 10 17) to rotate(90 10 17), it gets cut off. That's because I don't rotate it around the center of the image.
I tried using this formula to calculate the center, but I couldn't manage to do it myself:
x = 34; y = 20; o = 4.71238898; // degrees converted to radians
a = Math.abs(x * Math.sin(o)) + Math.abs(y * Math.cos(o));
b = Math.abs(x * Math.cos(o)) + Math.abs(y * Math.sin(o));
I am really bad with these math/SVG problems, so I am hoping someone can assist me.
Thanks
The centre of rotation is correct. But the trouble is that, now that it is rotated, your graphic isn't 20x34 any longer; it is 34x20.
So the first thing you have to do is update the width and height.
<svg xmlns="http://www.w3.org/2000/svg" width="34" height="20">
That's not the final solution of course, because the centre of this new 34x20 SVG is in a different place to the centre of the old 20x34 one. One solution would be to work out a different centre of rotation that would work so that the graphic rotated around into the right position in the new rectangle.
That's a bit tricky. Luckily there is a much simpler way. We can just add a viewBox to the SVG to tell the browser the dimensions of the area that the rotated symbol occupies. The browser will reposition it for us.
The four values in a viewBox attribute are:
<leftX> <topY> <width> <height>
We already know the width and height (34 and 20), so we just need to work out the left and top coords. Those are obviously just the centre-of-rotation less half the width and height respectively.
leftX = 10 - (newWidth / 2)
= 10 - 17
= -7
topY = 17 - (newHeight / 2)
= 17 - 10
= 7
So the viewBox attribute needs to be "-7 7 34 20".
<svg xmlns="http://www.w3.org/2000/svg" width="34" height="20" viewBox="-7 7 34 20">
<path transform="rotate(90 10 17)" fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path transform="rotate(90 10 17) translate(0 0)" fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>
Update
If you need to do arbitrary angles, then there is a better method, using Javascript.
Apply the transform to the paths
Call getBBox() on the SVG to get the dimensions of its content.
Use those values to update the viewBox, width, and height
var angle = 145; // degrees

var mysvg = document.getElementById("mysvg");
var paths = mysvg.getElementsByTagName("path");

// Apply a transform attribute to each path
for (var i = 0; i < paths.length; i++) {
    paths[i].setAttribute("transform", "rotate(" + angle + ",10,17)");
}

// Now that the paths have been rotated, get the bounding box
// of the SVG contents
var bbox = mysvg.getBBox();

// Update the viewBox from the bounds
mysvg.setAttribute("viewBox", bbox.x + " " + bbox.y + " " +
                   bbox.width + " " + bbox.height);

// Update the width and height
mysvg.setAttribute("width", bbox.width + "px");
mysvg.setAttribute("height", bbox.height + "px");
<svg id="mysvg" xmlns="http://www.w3.org/2000/svg" width="20" height="34">
<path fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>
