spark_apply with coxph function - apache-spark

I am trying to run the coxph() survival function with spark_apply, but I am getting the error below:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\XXXX\AppData\Local\Temp\1\Rtmpusw8Aw\file344868ef1e07_spark.log': Permission denied
I have seen this error whenever I gave a wrong R command, but in this case my coxph command works fine in R.
Reference for spark_apply with linear regression:
https://spark.rstudio.com/
I know spark_apply just tries to distribute the data as much as it can. What I want to know is whether I am making a mistake, whether spark_apply cannot run survival models, or whether I need to import the survival package into sparklyr in some way.
R Code:
coxph(Surv(t_start, t_stop, t_trans)~1,data_input)
O/p:
Null model
log likelihood= -XXX
n= XX
sparklyr code:
data <- copy_to(sc, data_input)
spark_apply(
  data,
  function(e) coxph(Surv(t_start, t_stop, t_trans) ~ 1, e)
)
O/p:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\XXXX\AppData\Local\Temp\1\Rtmpusw8Aw\file344868ef1e07_spark.log': Permission denied

Related

bioMart Package error: error in function useDataset

I am trying to use the biomaRt package to access data from Ensembl; however, I get an error message when using the useDataset() function. My code is shown below.
library(httr)
library(biomaRt)
listMarts()
ensembl = useMart("ENSEMBL_MART_ENSEMBL")
listDatasets(ensembl)
ensembl = useDataset("hsapiens_gene_ensembl", mart = ensembl)
When I call the useDataset() function I get an error message like this:
> ensembl = useDataset("hsapiens_gene_ensembl",mart = ensembl)
Ensembl site unresponsive, trying asia mirror
Error in textConnection(text, encoding = "UTF-8") :
invalid 'text' argument
and sometimes a different error message shows up:
> ensembl = useDataset("hsapiens_gene_ensembl",mart = ensembl)
Ensembl site unresponsive, trying asia mirror
Error in textConnection(bmResult) : invalid 'text' argument
It seems like the mirror automatically changes to asia, useast, or uswest, but the error message still shows up over and over again, and I don't know what to do.
Could anyone help me with this? I would be very grateful for any help or suggestion.
Kind regards Riley Qiu, Dongguan, China

How to open syslog files in Python

I am working on an assignment to open the syslog for a server program (called ticky) that creates logs and errors, assign the errors to a dictionary, and export them to a CSV file to sort and host on a web page.
I am unsure of how to access syslog files, as the course only covered sys.argv, and I don't know whether that can be used or whether I need to figure out how to use the syslog module. Once the log is opened, the regex should pull the error message and add it to the dictionary, either creating a new entry or incrementing the value of an existing key.
Am I on the right track?
#!/usr/bin/env python3
import re
import sys

errors = {}

# log line format:
# Jun 1 11:06:48 ubuntu.local ticky: ERROR: Connection to DB failed (username)
logfile = sys.argv[1]  # NOTE: check that this is the correct log file

with open(logfile) as f:
    for line in f:
        if "ERROR:" not in line:
            continue
        # capture the error message text between "ERROR: " and the trailing "(username)"
        regex_error = r"ERROR: (.+) \(.*\)$"
        error = re.search(regex_error, line.strip())
        if error is None:
            continue
        name = error[1]
        # count how many times each distinct error message appears
        errors[name] = errors.get(name, 0) + 1
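
The CSV-export step mentioned in the assignment could look something like the sketch below. This is only a sketch that assumes the errors dictionary built by the loop above; the write_error_report name and the errors_report.csv filename are illustrative and not part of the original assignment.

import csv

def write_error_report(errors, csv_path="errors_report.csv"):
    """Write the error counts to a CSV file, sorted by count in descending order."""
    # sort the (message, count) pairs so the most frequent errors come first
    sorted_errors = sorted(errors.items(), key=lambda item: item[1], reverse=True)
    with open(csv_path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["Error", "Count"])  # header row
        writer.writerows(sorted_errors)

# usage, after the log has been parsed:
# write_error_report(errors)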

Getting FileNotFoundError when calling python coded file from robot framework

I am trying to write automation code that picks up different environment values for my test case execution based on the environment value I pass in.
Here is the code I tried:
# env.robot
*** Settings ***
Variables    setup.py    stage01

*** Test Cases ***
Print values
    log to console    ${data}
# setup.py
from robot.libraries.BuiltIn import BuiltIn
import xlrd

def get_variables(env):
    file_location = "values.xlsx"
    workbook = xlrd.open_workbook(file_location)
    sheet = workbook.sheet_by_name(env)
    print("Env : " + sheet.name)
    data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)]
            for r in range(sheet.nrows)]
    print(data)
    BuiltIn().log_to_console(data)
    # Robot Framework expects get_variables in a variable file to return a mapping
    return {"data": data}
The response I am getting:
[ ERROR ] Error in file 'D:\env.robot': Processing variable file 'D:\setup2.py' with arguments [ stage01 ]
failed: FileNotFoundError: [Errno 2] No such file or directory: 'values.xlsx'
values.xlsx is present in the same directory as the .py and .robot files.
I want to get the data from the values.xlsx file based on the value of the env variable and use those values in my Robot test cases.
Please suggest what I need to modify; any other approach would also do.
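
If the FileNotFoundError comes from Robot Framework being launched from a directory other than the one containing values.xlsx, one option is to resolve the spreadsheet path relative to the variable file itself. A minimal sketch, assuming the same file layout as above:

# setup.py (sketch): locate values.xlsx next to this variable file
import os
import xlrd

def get_variables(env):
    # directory containing setup.py, regardless of where robot was started from
    base_dir = os.path.dirname(os.path.abspath(__file__))
    file_location = os.path.join(base_dir, "values.xlsx")
    workbook = xlrd.open_workbook(file_location)
    sheet = workbook.sheet_by_name(env)
    data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)]
            for r in range(sheet.nrows)]
    return {"data": data}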

PooledOLS in python

I have been trying to run a PanelOLS regression, but I keep getting errors. I have attached the lines of code and the error.
mod = PanelOLS(data.FDI, data[['Trade openness', 'GDP growth']], time_effects=True)
res = mod.fit(cov_type='clustered', cluster_entity=True)
error messages
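
For reference, linearmodels' PanelOLS expects the DataFrame to carry a two-level (entity, time) index before the model is built. Below is a minimal sketch of that setup; the file name panel_data.csv and the index columns country and year are placeholders, not taken from the original question.

import pandas as pd
from linearmodels.panel import PanelOLS

# PanelOLS needs an (entity, time) MultiIndex on the data
data = pd.read_csv("panel_data.csv")          # hypothetical input file
data = data.set_index(["country", "year"])    # entity level first, then time level

dependent = data["FDI"]
exog = data[["Trade openness", "GDP growth"]]

mod = PanelOLS(dependent, exog, time_effects=True)
res = mod.fit(cov_type="clustered", cluster_entity=True)
print(res)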

Error in `python3': free(): invalid pointer

I'm trying to read all the CSV files in two directories using the glob module:
import os
import pandas as pd
import glob

def get_list_of_group_df(filepath):
    all_group_df_list = []
    groups_path = filepath
    for file in glob.glob(groups_path):
        name = os.path.basename(file)
        name = name.partition('_raw')[0]
        with open(file, 'r') as name_vcf:
            group_vcf_to_df = pd.read_csv(name_vcf, delimiter='\t',
                                          header=0, index_col=False, low_memory=False,
                                          usecols=['A', 'B', 'C', 'D'])
        group_df_wo_duplicates = group_vcf_to_df.drop_duplicates()
        group_df = group_df_wo_duplicates.reset_index(drop=True)
        group_df['group_name'] = name
        all_group_df_list.append(group_df)
    return all_group_df_list

def get_freq():
    group_filepath_dict = {'1_group': "/home/Raw_group/*.tsv",
                           '2_group': "/home/Raw_group/*.tsv"}
    for group, filepath in group_filepath_dict.items():
        print(get_list_of_group_df(filepath))

get_freq()
When I run this script locally, it works just fine. However, running it on an Ubuntu server gives me the following error message:
Error in `python3': free(): invalid pointer: 0x00007fcc970d76be ***
Aborted (core dumped)
I'm using Python 3.6.3. Can anyone tell me how to solve the problem?
I have a similar problem in Python 3.7.3 under Raspbian Buster 2020-02-13. My program dies with free(): invalid pointer, except that no pointer is given, and there is no core dump and no stack trace. So I have nothing to debug with. This has happened a few times, usually after the program has been running for a day or two, so I suspect it's a very slow memory leak or a very infrequent intermittent bug in Python's garbage collection. I am not doing any memory management myself.
