from spacy.symbols import amod, prep, nsubj, csubj, dobj, iobj, acomp, attr
from spacy.symbols import NN, NNS, JJ, JJS, JJR, conj
MR = [amod, prep, nsubj, csubj, dobj, iobj, acomp, attr]
nn = [NN, NNS]
jj = [JJ, JJS, JJR]
CONJ = [conj]
target = set()
opinion_word = ['great']
for each_sent in list(doc.sents):
    for word in each_sent:
        # compare the token's text, not the Token object, with the opinion words
        if word.text in opinion_word and word.dep in MR and word.head.pos in nn:
            target.add(word.head)
Hello,
I know this question has been asked before, but I didn’t find a suitable answer for my problem.
I would like to group the imported symbols into lists and use those lists in an if statement, as shown in my code.
Any suggestions?
Hi, I found the trick for my problem.
I just had to import each attribute from the module under an alias:
from spacy.symbols import amod as a, prep as b, nsubj as c, acomp as d ...
Then I can build the list of attributes from those aliases:
MR = [a, b, c, d]
and use MR in my if statement.
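The aliasing pattern above can be sketched without spaCy installed. This is only an illustration under an assumption: the standard-library errno module stands in for spacy.symbols, because it also exposes plain integer constants, and the helper is_in_group is a made-up name. As far as I know, an alias refers to the same object as the original name, so a list built from either should behave the same way.

```python
# Stand-in for spacy.symbols (assumption): errno also exposes integer constants.
from errno import ENOENT, EACCES             # plain import
from errno import ENOENT as a, EACCES as b   # aliased import of the same names

MR_plain = [ENOENT, EACCES]
MR_alias = [a, b]

def is_in_group(code, group):
    """Hypothetical helper: True if the integer code is one of the constants in group."""
    return code in group

# Both lists hold the same integers, so membership tests agree.
print(MR_plain == MR_alias)
print(is_in_group(ENOENT, MR_alias))
```

With spaCy the list would hold the imported symbols instead, and the membership test would be something like word.dep in MR.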
As part of a school assignment on DSLs and code generation, I have to translate the following program, written in Python with scikit-learn, into R (the topic of the exercise is a hypothetical machine learning DSL).
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_validate
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
df = pd.read_csv('boston.csv', sep=',')
df.head()
y = df["medv"]
X = df.drop(columns=["medv"])
clf = DecisionTreeRegressor()
scoring = ['neg_mean_absolute_error','neg_mean_squared_error']
results = cross_validate(clf, X, y, cv=6,scoring=scoring)
print('mean_absolute_errors = '+str(results['test_neg_mean_absolute_error']))
print('mean_squared_errors = '+str(results['test_neg_mean_squared_error']))
Since I'm a complete newbie in machine learning, and especially in R, I can't do it.
Could someone help me?
Sorry for the late answer; you have probably already finished your school assignment. Of course we cannot just do it for you, so you will have to figure most of it out yourself. I also don't understand exactly what you need to do. But here are some tips:
Read a csv file
data <- read.csv(file="name_of_the_file", header=TRUE, sep=",")
data <- as.data.frame(data)
header=TRUE indicates that the first row of the file contains the column names; sep="," is the same as in Python (the separator in the file is ',').
as.data.frame makes sure that your data is kept in a data frame.
Add/delete a column
data <- data[, names(data) != "name_of_the_column_to_be_deleted"]  # delete a column
data$name_of_column_to_be_added <- c(1:10)  # add a column
To add a column you need to supply the elements it will contain (here the values 1 to 10, which assumes the data frame has ten rows). The # symbol marks the beginning of a comment.
Modelling
For the modelling part I am not sure what you want to achieve, but R offers a huge selection of algorithms to choose from. For example, if you want to grow a tree, take a look at https://www.statmethods.net/advstats/cart.html, which uses the following script to grow a tree:
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method="class", data=kyphosis)
I am sure this is a basic task and that the answer is somewhere on Google, but I don't know what it is called, so I am having a hard time searching for it. Almost every page demonstrates merging two lists, which is not what I want.
I have two lists, and I would like to append each value from the list "add" to the end of each word in the list "stuff" and print the results.
add = ['123', '12345']
stuff = ['Cars', 'Suits', 'Drinks']
Desired output
Cars123
Cars12345
Suits123
Suits12345
Drinks123
Drinks12345
Thanks in advance, and sorry again for bad research.
Is there any reason you can't just use a nested loop? It's certainly the simplest solution.
for i in stuff:
    for j in add:
        print(i+j)
gives
Cars123
Cars12345
Suits123
Suits12345
Drinks123
Drinks12345
This assumes that both lists contain strings.
As a side point, a variable called add reads like a function name (compare operator.add), and shadowing function names is generally a bad idea, so I would consider renaming it.
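If you want the combined strings collected rather than printed one by one, the same nested loop can be written as a list comprehension. A minimal sketch, assuming both lists hold strings; the names suffixes and combined are my own, with suffixes standing in for add:

```python
stuff = ['Cars', 'Suits', 'Drinks']
suffixes = ['123', '12345']  # hypothetical name, stands in for the list "add"

# The outer loop runs over stuff and the inner loop over suffixes,
# the same order as the nested for loops.
combined = [word + suffix for word in stuff for suffix in suffixes]

for item in combined:
    print(item)
```

The list keeps the results around for later use; if printing is all you need, the plain nested loop is just as good.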
Ignore what I said about combinations in the comment!
>>> from itertools import product
>>> add = ['123', '12345']
>>> stuff = ['Cars', 'Suits', 'Drinks']
>>> for a, s in product(add, stuff):
...     a+s
...
'123Cars'
'123Suits'
'123Drinks'
'12345Cars'
'12345Suits'
'12345Drinks'
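Note that the session above puts the number first ('123Cars'), while the question asks for 'Cars123'. A small sketch of one way to get the requested order, assuming the only change needed is to pass stuff first to product and to concatenate the word before the suffix (the name results is my own):

```python
from itertools import product

add = ['123', '12345']
stuff = ['Cars', 'Suits', 'Drinks']

# product varies its last argument fastest, so each word is paired
# with every suffix before moving on to the next word.
results = [s + a for s, a in product(stuff, add)]

for r in results:
    print(r)
```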
Addendum, timing information: the code below compares the nested loop with the product function from itertools. In my measurement the product version did take longer, by a factor of about 2.64.
import timeit
def approach_1():
    add = ['123', '12345']; stuff = ['Cars', 'Suits', 'Drinks']
    for a in add:
        for s in stuff:
            a+s
def approach_2():
    from itertools import product
    add = ['123', '12345']; stuff = ['Cars', 'Suits', 'Drinks']
    for a, s in product(add, stuff):
        a+s
t1 = timeit.timeit('approach_1()','from __main__ import approach_1', number=10000)
t2 = timeit.timeit('approach_2()','from __main__ import approach_2', number=10000)
print(t2/t1)
You need two for loops for that:
for stuff_element in stuff:
    for add_element in add:
        print(stuff_element+add_element)
Try this:
for i in stuff:
    for j in add:
        print(i+j)
Let me know if it works
Is there a way to calculate log base c in Python?
c is a variable and may change depending on other values.
I am new to programming and to Python 3.
The math module already has a built-in function that does this.
from math import log
def logOc(c, num):
    # log(x, base) returns the logarithm of x to the given base
    return log(num, c)
print(logOc(3, 3**24))
You can read more about log in the documentation for Python's math module.
Yes, you can simply use the math module's log() function:
import math
c = 100
val = math.log(10000, c)  # the first argument is the number, the second is the base
print(val)
Output:
2.0
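For what it's worth, the two-argument form is equivalent to the change-of-base identity log_c(x) = ln(x) / ln(c). A minimal sketch comparing the two; the helper name log_base is my own and not part of the math module:

```python
import math

def log_base(x, c):
    """Logarithm of x in base c via the change-of-base identity (hypothetical helper)."""
    return math.log(x) / math.log(c)

c = 100      # the base; any positive number other than 1
x = 10000

two_arg = math.log(x, c)        # built-in two-argument form
by_identity = log_base(x, c)    # same value computed by hand

print(two_arg, by_identity)

# For the common bases 2 and 10 the math module has dedicated functions.
print(math.log2(8), math.log10(1000))
```

Since floating-point results can be off by a tiny amount for some inputs, compare them with math.isclose rather than ==.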
How would I go about extracting the morphologically related verb(s) for a given noun?
For example, I would like to be able to build a function like this (using nltk):
related_verb('decision') -> 'decide'
related_verb('walk') -> 'walk'
related_verb('shower') -> 'shower'
related_verb('exclusion') -> 'exclude'
This is really simple to do with WordNet's '-derin' command (http://wordnet.princeton.edu/man/wn.1WN.html#toc), but I can't seem to do the same thing with nltk. Does anyone have any ideas?
Thanks!
Perhaps this could help:
Get lemma:
from nltk.corpus import wordnet as wn
lem = wn.lemmas('exclusion')[0]
print(lem)
>>> Lemma('exclusion.n.01.exclusion')
Get related forms:
related_forms = lem.derivationally_related_forms()
print(related_forms)
>>> [Lemma('bar.v.01.exclude'), Lemma('exclude.v.02.exclude')]
Get names of related verb lemmas:
# in NLTK 3, name(), synset() and pos() are methods rather than attributes
print([related_form.name() for related_form in related_forms
       if related_form.synset().pos() == 'v'])
>>> ['exclude', 'exclude']
How can a data frame be converted to a SpatialGridDataFrame using the R maptools library? I am new to rpy2, so this might be a very basic question.
The R Code is:
coordinates(dataf)=~X+Y
In Python:
import rpy2
import rpy2.robjects as robjects
r = robjects.r
# Create a test data frame (integer values, since these are IntVectors)
d = {'TEST': robjects.IntVector((221, 412, 332)), 'X': robjects.IntVector((25, 31, 44)), 'Y': robjects.IntVector((25, 35, 14))}
dataf = robjects.r['data.frame'](**d)
r.library('maptools')
# From here on I could not work out how to write the R code above using the rpy2 documentation
Apart from this particular question, I would be pleased to get some feedback on a more general idea: my final goal is to do regression kriging on spatial data using the gstat library. The R script works fine, but I would like to call it from Python/ArcGIS. What do you think about this task? Is it possible via rpy2?
Thanks a lot!
Richard
In some cases, Rpy2 is still unable to dynamically (and automagically) generate smart bindings.
An analysis of the R code will help:
coordinates(dataf)=~X+Y
This can be more explicitly written as:
dataf <- "coordinates<-"(dataf, formula("~X+Y"))
That last expression makes the Python/rpy2 version straightforward:
from rpy2.robjects.packages import importr
sp = importr('sp') # "coordinates<-()" is there
from rpy2.robjects import baseenv, Formula
maptools_set = baseenv.get('coordinates<-')
dataf = maptools_set(dataf, Formula(' ~ X + Y'))
To be (wisely) explicit about where "coordinates<-" is coming from, use:
maptools_set = getattr(sp, 'coordinates<-')