How to count how many instances of a list's items occur in a column with multiple string entries per row? - python-3.x

I have a dataframe column, df['Q2'], containing all the responses to the survey question 'Which indicators do you use?'. There are 36 statistical indicators users can choose from, plus one additional answer that is simply 'All of the indicators'. Each row contains multiple string answers, an artefact of a survey designed by someone else.
How do I pattern match/cross-refer respondents' answers, which mention multiple indicators across multiple lines in the rows of column Q2, against the indicators in my neat list?
indicators = ["All of the indicators",
"A1 / Eng13",
"A2 / Eng14",
"Eng14b",
"A4 / Eng17",
"A5",
"Eng16",
"B1a / Eng22a",
"B1b / Eng22b",
"B2 / Eng23",
"B4 / Eng18",
"B5a / Eng19a",
"B5b / Eng19b",
"B6 / Eng20",
"B7 / Eng21",
"C1 / Eng1",
"C2 / Eng3",
"Eng2a",
"C3a / Eng2b",
"C3b / Eng4c",
"C4a / Eng4a",
"C4b / Eng4b",
"C5",
"C6",
"C7",
"C8",
"Eng5",
"Eng6",
"Eng7",
"Eng8",
"C9a / Eng12a",
"C9b / Eng12b",
"D1a / Eng11",
"D1b / Eng9",
"D1c / Eng10",
"E1 / Eng24",
"E2 / Eng15"]
indicators_mentioned = df['Q2'] #all the responses in column 2 of dataframe
indicators_mentioned_as_string = indicators_mentioned.to_string() # convert responses to string
And my regex/pattern-matching attempt:
import re

indicator_regex = re.compile(r'All of the indicators')  # pattern to match for the first indicator
instances_of_all_indicators = indicator_regex.findall(indicators_mentioned_as_string)
# findall() returns a list of every match of the pattern within the string
instances_of_all_indicators_summed = len(instances_of_all_indicators)  # e.g. a list of 6 matches gives the count 6
type(instances_of_all_indicators_summed)  # check the count is an int so it can go into the dictionary below as a value
print(instances_of_all_indicators_summed)
#regex for second indicator
indicator_regex = re.compile(r'A1 / Eng13')
instances_of_A1 = indicator_regex.findall(indicators_mentioned_as_string)
instances_of_A1_summed = len(instances_of_A1)
print(instances_of_A1_summed)
I would like all the responses per indicator to go into a dictionary from which I can then make a nice chart.
indicator_by_response = {
"All of the indicators": instances_of_all_indicators_summed,
"A1 / Eng13": instances_of_A1_summed,
"A2 / Eng14": instances_of_A2_summed,
"Eng14b": instances_of_Eng14b_summed,
# "A4 / Eng17": instances_of_A4_summed,
# "A5": instances_of_A5_summed,
# "Eng16": instances_of_Eng16_summed,
# "B1a / Eng22a": instances_of_B1a_summed,
# "B1b / Eng22b": instances_of_B1b_summed,
# "B2 / Eng23": instances_of_B2_summed,
# "B4 / Eng18": instances_of_B4_summed,
# "B5a / Eng19a": instances_of_B5a_summed,
# "B5b / Eng19b": instances_of_B5b_summed,
# "B6 / Eng20": instances_of_B6_summed,
# "B7 / Eng21": instances_of_B7_summed,
# "C1 / Eng1": instances_of_C1_summed,
# "C2 / Eng3": instances_of_C2_summed,
# "Eng2a": instances_of_Eng2a_summed,
# "C3a / Eng2b": instances_of_C3a_summed,
# "C3b / Eng4c": instances_of_C3b_summed,
# "C4a / Eng4a": instances_of_C4a_summed,
# "C4b / Eng4b": instances_of_C4b_summed,
# "C5": instances_of_C5_summed,
# "C6": instances_of_C6_summed,
# "C7": instances_of_C7_summed,
# "C8": instances_of_C8_summed,
# "Eng5": instances_of_Eng5_summed,
# "Eng6": instances_of_Eng6_summed,
# "Eng7": instances_of_Eng7_summed,
# "Eng8": instances_of_Eng8_summed,
# "C9a / Eng12a": instances_of_C9a_summed,
# "C9b / Eng12b": instances_of_C9b_summed,
# "D1a / Eng11": instances_of_D1a_summed,
# "D1b / Eng9": instances_of_D1b_summed,
# "D1c / Eng10": instances_of_D1c_summed,
# "E1 / Eng24": instances_of_E1_summed,
# "E2 / Eng15": instances_of_E2_summed,
}
I converted my pandas series to a string and then used findall(), but that still returns the responses in a messy way, when what I want is to group the instances of each indicator string in the list into a neat dictionary and a chart. To pattern match I used regex after importing re, but this is a completely inelegant approach that requires me to write a regex block and variable for every indicator and manually collect the counts into a dictionary; there must be a simpler way.
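For what it's worth, the per-indicator regex blocks can be collapsed into a single loop over the list. A minimal sketch, with an invented three-indicator list and made-up responses standing in for the real df['Q2']:

```python
import re

import pandas as pd

# Shortened indicator list and invented sample responses, for illustration only
indicators = ["All of the indicators", "A1 / Eng13", "A2 / Eng14"]
df = pd.DataFrame({"Q2": [
    "A1 / Eng13, A2 / Eng14",
    "All of the indicators",
    "A1 / Eng13",
]})

# Count occurrences of each indicator across all rows;
# re.escape() guards against regex metacharacters in the labels
indicator_by_response = {
    ind: int(df["Q2"].str.count(re.escape(ind)).sum())
    for ind in indicators
}
print(indicator_by_response)
# {'All of the indicators': 1, 'A1 / Eng13': 2, 'A2 / Eng14': 1}
```

The resulting dictionary can then go straight into a chart, e.g. via pd.Series(indicator_by_response).plot.bar().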

Related

Add suffix to a specific row in pandas dataframe

I'm trying to add a suffix to the % Paid row in the dataframe, but I'm stuck with only adding a suffix to the column names.
Is there a way I can add a suffix to a specific row's values?
Any suggestions are highly appreciated.
d={
("Payments","Jan","NOS"):[],
("Payments","Feb","NOS"):[],
("Payments","Mar","NOS"):[],
}
d = pd.DataFrame(d)
d.loc["Total",("Payments","Jan","NOS")] = 9991
d.loc["Total",("Payments","Feb","NOS")] = 3638
d.loc["Total",("Payments","Mar","NOS")] = 5433
d.loc["Paid",("Payments","Jan","NOS")] = 139
d.loc["Paid",("Payments","Feb","NOS")] = 123
d.loc["Paid",("Payments","Mar","NOS")] = 20
d.loc["% Paid",("Payments","Jan","NOS")] = round((d.loc["Paid",("Payments","Jan","NOS")] / d.loc["Total",("Payments","Jan","NOS")])*100)
d.loc["% Paid",("Payments","Feb","NOS")] = round((d.loc["Paid",("Payments","Feb","NOS")] / d.loc["Total",("Payments","Feb","NOS")])*100)
d.loc["% Paid",("Payments","Mar","NOS")] = round((d.loc["Paid",("Payments","Mar","NOS")] / d.loc["Total",("Payments","Mar","NOS")])*100)
without suffix
I tried this way and it works, but I'm looking to add the suffix to an entire row at once:
d.loc["% Paid",("Payments","Jan","NOS")] = str(round((d.loc["Paid",("Payments","Jan","NOS")] / d.loc["Total",("Payments","Jan","NOS")])*100)) + '%'
d.loc["% Paid",("Payments","Feb","NOS")] = str(round((d.loc["Paid",("Payments","Feb","NOS")] / d.loc["Total",("Payments","Feb","NOS")])*100)) + '%'
d.loc["% Paid",("Payments","Mar","NOS")] = str(round((d.loc["Paid",("Payments","Mar","NOS")] / d.loc["Total",("Payments","Mar","NOS")])*100)) + '%'
with suffix
Select the row by its first index value, round and convert to integers, then to strings, and add ' %':
d.loc["% Paid"] = d.loc["% Paid"].round().astype(int).astype(str).add(' %')
print (d)
Payments
Jan Feb Mar
NOS NOS NOS
Total 9991.0 3638.0 5433.0
Paid 139.0 123.0 20.0
% Paid 1 % 3 % 0 %
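To see the one-liner in isolation, here is a self-contained sketch on a simplified frame (plain columns instead of the question's MultiIndex; the .astype(object) step is my addition, so that writing a string row into numeric columns doesn't trip pandas dtype checks):

```python
import pandas as pd

# Simplified stand-in for the question's frame
d = pd.DataFrame({"Jan": [9991.0, 139.0], "Feb": [3638.0, 123.0]},
                 index=["Total", "Paid"]).astype(object)

# Build the whole % Paid row at once: round, cast to int, cast to str, add suffix
d.loc["% Paid"] = ((d.loc["Paid"] / d.loc["Total"]) * 100
                   ).astype(float).round().astype(int).astype(str).add(" %")
print(d.loc["% Paid"].tolist())
# ['1 %', '3 %']
```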

Normalising units/Replace substrings based on lists using Python

I am trying to normalize weight units in a string.
Eg:
1. SUCO MARACUJA COM GENGIBRE PCS 300 Millilitre - SUCO MARACUJA COM GENGIBRE PCS 300 ML
2. OVOS CAIPIRAS ANA MARIA BRAGA 10UN - OVOS CAIPIRAS ANA MARIA BRAGA 10U
3. SUCO MARACUJA MAMAO PCS 300 Gram - SUCO MARACUJA MAMAO PCS 300 G
4. SUCO ABACAXI COM MACA PCS 300Milli litre - SUCO ABACAXI COM MACA PCS 300ML
The keyword table is :
unit = ['Kilo','Kilogram','Gram','Milligram','Millilitre','Milli litre',
        'Dozen','Litre','Un','Und','Unid','Unidad','Unidade','Unidades']
norm_unit = ['KG','KG','G','MG','ML','ML','DZ','L','U','U','U','U','U','U']
I tried to take up these lists as a table but am having difficulty in comparing two dataframes or tables in python.
I tried the below code.
unit = ['Kilo','Kilogram','Gram','Milligram','Millilitre','Milli litre',
        'Dozen','Litre','Un','Und','Unid','Unidad','Unidade','Unidades']
norm_unit = ['KG','KG','G','MG','ML','ML','DZ','L','U','U','U','U','U','U']
z = 'SUCO MARACUJA COM GENGIBRE PCS 300 Millilitre'
#for row in mongo_docs:
#    z = row['clean_hntproductname']
for x, y in zip(unit, norm_unit):  # pair each unit with its normalised form
    if re.search(r'\s' + re.escape(x) + r'$', z, re.I):
        clean_hntproductname = z.lower().replace(x.lower(), y.lower())
        # myquery3 = {"_id": row['_id']}
        # newvalues3 = {"$set": {"clean_hntproductname": clean_hntproductname}}
        # ds_hnt_prod_data.update_one(myquery3, newvalues3)
I'm using Python(Jupyter) with MongoDb(Compass). Fetching data from Mongo and writing back to it.
From my understanding you want to:
update all the rows in a table which contain the words in the unit array, replacing them with the ones in norm_unit.
(Disclaimer: I'm not familiar with MongoDB or Python.)
What you want is to create a mapping (using a hash) of the words you want to change.
Here's a trivial solution (i.e. not the best solution, but it should point you in the right direction):
unit_conversions = {
    'Kilo': 'KG',
    'Kilogram': 'KG',
    'Gram': 'G'
}
# pseudo-code
for each row that you want to update
    item_description = get the value of the string in the column
    for each key in unit_conversions (e.g. 'Kilo')
        see if item_description contains the key
        if it does, replace it with unit_conversions[key] (e.g. 'KG')
    update the row
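The pseudo-code above can be turned into working Python with a single regex built from the mapping. This is only a sketch, assuming a plain substring replacement is acceptable; the normalise name and the longest-match-first trick are mine, not from the original answer:

```python
import re

unit_conversions = {
    'Kilo': 'KG', 'Kilogram': 'KG', 'Gram': 'G', 'Milligram': 'MG',
    'Millilitre': 'ML', 'Milli litre': 'ML', 'Dozen': 'DZ', 'Litre': 'L',
    'Un': 'U', 'Und': 'U', 'Unid': 'U', 'Unidad': 'U',
    'Unidade': 'U', 'Unidades': 'U',
}

# Case-insensitive lookup, and longest keys first so 'Kilogram' wins over 'Kilo'
lookup = {k.lower(): v for k, v in unit_conversions.items()}
pattern = re.compile(
    '|'.join(re.escape(k) for k in sorted(unit_conversions, key=len, reverse=True)),
    re.IGNORECASE,
)

def normalise(text):
    # Replace each matched unit with its normalised form from the mapping
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

print(normalise('SUCO MARACUJA COM GENGIBRE PCS 300 Millilitre'))
# SUCO MARACUJA COM GENGIBRE PCS 300 ML
```

Each Mongo document's clean_hntproductname could then be passed through normalise() before the update_one call.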

Can't print entire data set with fits from astropy.io

I have a large fits file (21.4 MB). I would like to print the contents of it to a text file, but can only access a portion of it. I am looking for help getting the entire file to text format.
> from astropy.io import fits
> hdulist = fits.open('N20190326G0041i.fits')
Information on the file. Note that everything is in the primary HDU.
> hdulist.info()
Filename: N20190326G0041i.fits
No. Name Ver Type Cards Dimensions Format
0 PRIMARY 1 PrimaryHDU 183 (190685, 28) float32
I can access the full header but it is extremely long. I included it at the end of this post.
> hdu = hdulist[0]
> hdu.header
However, I only get a portion of the data using hdu.data
> hdu.data
array([[ 4.0630740e+02, 4.0631021e+02, 4.0631290e+02, ...,
1.0478779e+03, 1.0478831e+03, 1.0478882e+03],
[ 2.7955999e+01, 3.1493999e+01, 1.2378000e+01, ...,
-4.3614998e+00, -1.8785000e+00, -8.8672000e-01],
[ 2.8534999e+00, 2.8862000e+00, 2.9282999e+00, ...,
-6.1020999e+00, -5.2989998e+00, -5.1680999e+00],
...,
[ 1.7951000e+04, 2.9099000e+04, 3.5257000e+03, ...,
1.0594000e+03, 7.9347998e+02, 1.6349001e+02],
[ 3.1568999e+03, 3.1631001e+03, 3.2426001e+03, ...,
3.2828000e+02, 3.2062000e+02, 3.2189001e+02],
[ 3.3338000e+03, 3.3806001e+03, 3.4557000e+03, ...,
2.1803000e+02, 2.2574001e+02, 2.3003999e+02]], dtype=float32)
What I typically do to print fits files to text files is ...
> table = hdulist[0].data
> print(table, file = open('test.txt','a'))
This "works", and outputs the same excerpt of the data that hdu.data prints on screen.
> [[ 4.0630740e+02 4.0631021e+02 4.0631290e+02 ... 1.0478779e+03
1.0478831e+03 1.0478882e+03]
[ 2.7955999e+01 3.1493999e+01 1.2378000e+01 ... -4.3614998e+00
-1.8785000e+00 -8.8672000e-01]
[ 2.8534999e+00 2.8862000e+00 2.9282999e+00 ... -6.1020999e+00
-5.2989998e+00 -5.1680999e+00]
...
[ 1.7951000e+04 2.9099000e+04 3.5257000e+03 ... 1.0594000e+03
7.9347998e+02 1.6349001e+02]
[ 3.1568999e+03 3.1631001e+03 3.2426001e+03 ... 3.2828000e+02
3.2062000e+02 3.2189001e+02]
[ 3.3338000e+03 3.3806001e+03 3.4557000e+03 ... 2.1803000e+02
2.2574001e+02 2.3003999e+02]]
Also, I repeated all of the above things using memmap = True, but get the same results.
> from astropy.io import fits
> hdulist = fits.open('N20190326G0041i.fits', memmap = True)
I also tried the convenience functions, but that produced the exact same excerpt as hdu.data.
> tbdata = fits.getdata('N20190326G0041i.fits')
> print(tbdata,file=open('test.txt','a'))
I also tried the astropy.table package, but could not get it to work either.
> from astropy.table import Table
> t = Table.read(hdulist[0], format = 'fits')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/anaconda3/lib/python3.7/site-packages/astropy/table/connect.py", line 52, in __call__
out = registry.read(cls, *args, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/registry.py", line 523, in read
data = reader(*args, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/connect.py", line 195, in read_table_fits
memmap=memmap)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 151, in fitsopen
lazy_load_hdus, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 390, in fromfile
lazy_load_hdus=lazy_load_hdus, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1039, in _readfrom
fileobj = _File(fileobj, mode=mode, memmap=memmap, cache=cache)
File "/anaconda3/lib/python3.7/site-packages/astropy/utils/decorators.py", line 521, in wrapper
return function(*args, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/file.py", line 180, in __init__
self._open_filelike(fileobj, mode, overwrite)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/file.py", line 533, in _open_filelike
"method, required for mode '{}'.".format(self.mode))
OSError: File-like object does not have a 'write' method, required for mode 'ostream'.
However, if I use
> t=Table.read(hdu.data, format='fits')
then I get a different error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/anaconda3/lib/python3.7/site-packages/astropy/table/connect.py", line 52, in __call__
out = registry.read(cls, *args, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/registry.py", line 523, in read
data = reader(*args, **kwargs)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/connect.py", line 195, in read_table_fits
memmap=memmap)
File "/anaconda3/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 147, in fitsopen
if not name:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I have also tried tprint and tdump in PyRAF; they simply give the error "Warning: " with no other helpful information. I also tried writing the fits file to text using the wspectext function in PyRAF, but it is unable to do so.
However, I want to be able to just use the fits package from astropy to output the data to a text file. I have done this countless times with other fits files from a variety of telescope pipelines, but it isn't working this time. Any help is much appreciated.
The header of the file is below. I'm wondering since the header was written in IDL if the only way I'll be able to access the info is through IDL? This seems unlikely to me. I'd really rather avoid using IDL if possible. We have a site license at our university, but we are not permitted on campus during the pandemic and our VPN capabilities are lacking.
> hdu.header
SIMPLE = T / Written by IDL:
BITPIX = -32 / Real*4 (complex, stored as float)
NAXIS = 2 / Number of axes
NAXIS1 = 190685 / Number of pixel columns
NAXIS2 = 28 / Number of pixel rows
INHERIT = F / No need to inherit global keywords
BZERO = 0. / Data is Unsigned Integer
BSCALE = 1. / Scale factor
IMAGESWV= 'CFHT DetCom v3.60.18 (Apr 20 2017)' / Image creation software version
OBSTYPE = 'OBJECT ' / Observation / Exposure type
EXPTYPE = 'OBJECT ' / See OBSTYPE
EXPTIME = 2561.0 / Integration time (seconds)
DARKTIME= 2561.0 / Dark current time (seconds)
DETECTOR= 'OLAPA ' / Science Detector
CCD = 'Unknown ' / Science Detector (use DETECTOR)
IMAGEID = 0 / CCD chip number
CHIPID = 0 / Use IMAGEID instead
DETSIZE = '[1:2048,1:4608]' / Total data pixels in full mosaic
RASTER = 'FULL ' / Active raster description
CCDSUM = '1 1 ' / Binning factors
CCDBIN1 = 1 / Binning factor along first axis
CCDBIN2 = 1 / Binning factor along second axis
PIXSIZE = 13.5 / Pixel size for both axes (microns)
AMPLIST = 'a,b ' / List of amplifiers for this image
CCDSIZE = '[1:2048,1:4608]' / Detector imaging area size
CCDSEC = '[21:2068,1:4608]' / Read out area of the detector (unbinned)
TRIMSEC = '[21:2068,4:4605]' / Useful imaging area of the detector
BSECA = '[1:20,1:4608]' / Overscan/prescan (bias) area from Amp A
BSECB = '[2069:2088,1:4608]' / Overscan/prescan (bias) area from Amp B
CSECA = '[21:1044,1:4608]' / Section in full CCD for DSECA
CSECB = '[1045:2068,1:4608]' / Section in full CCD for DSECB
DSECA = '[21:1044,1:4608]' / Imaging area from Amp A
DSECB = '[1045:2068,1:4608]' / Imaging area from Amp B
TSECA = '[21:1044,4:4605]' / Trim section for Amp A
TSECB = '[1045:2068,4:4605]' / Trim section for Amp B
MAXLIN = 65535 / Maximum linearity value (ADU)
SATURATE= 65535 / Saturation value (ADU)
GAINA = 1.10 / Amp A gain (electrons/ADU)
GAINB = 1.20 / Amp B gain (electrons/ADU)
RDNOISEA= 2.90 / Amp A read noise (electrons)
RDNOISEB= 2.90 / Amp B read noise (electrons)
DARKCUR = 0 / Dark current (e-/pixel/hour)
RDTIME = 30.00 / Read out time (sec)
CONSWV = 'olD=137,DCU=49' / Controller software DSPID and SERNO versions
DETSTAT = 'ok ' / Detector temp range (-105..-90)
DETTEM = -100.2 / Detector temp deg C = 745.502 + -0.278 * 3042
INSTRUME= 'GRACES ' / Instrument Name
ECAMFOC = -3.61 / ESPaDOnS camera focus position (mm)
EHARTPOS= 'OUT ' / ESPaDOnS hartmann position FULL/DOWN/UP/OUT
EEMSHUT = 'CLOSE ' / ESPaDOnS exposure meter shutter OPEN/CLOSED
EEMSTATE= 'OFF ' / ESPaDOnS exposure meter state ON/OFF
EEMCNTS = -9999 / ESPaDOnS exposure meter count average
ETSP1BEG= 17.38 / ESPaDOnS down mirror temp at start (deg C)
ETSP2BEG= 17.61 / ESPaDOnS camera temp at start (deg C)
ETSP3BEG= 17.55 / ESPaDOnS up mirror temp at start (deg C)
ETSP4BEG= 17.31 / ESPaDOnS hygrometer temp at start (deg C)
EPRSPBEG= -3.35 / ESPaDOnS relative pressure at start (mb)
ERHSPBEG= 21.11 / ESPaDOnS relative humidity at start (%)
ETSP1END= 17.38 / ESPaDOnS down mirror temp at end (deg C)
ETSP2END= 17.59 / ESPaDOnS camera temp at end (deg C)
ETSP3END= 17.55 / ESPaDOnS up mirror temp at end (deg C)
ETSP4END= 17.31 / ESPaDOnS hygrometer temp at end (deg C)
EPRSPEND= -3.29 / ESPaDOnS relative pressure at end (mb)
EREADSPD= 'Slow: 2.90e noise, 1.15e/ADU, 30s' / ESPaDOnS det read out xslow/slow
GSLIPOS = 'TWOSLICE' / GRACES slicer bench position (# and mm)
GSLICER = 'TWOSLICE' / GRACES slicer position (# and deg)
GDEKKER = 'TWOSLICE' / GRACES dekker position (# and mm)
GPMIRROR= 'GEMINI ' / GRACES pickoff mirror position (# and mm)
GFIBMODE= 'GRACES ' / GRACES fiber position (ESPADONS or GRACES)
O_BSCALE= 1.00000 / Original BSCALE Value
RAWIQ = '85-percentile' /Raw Image Quality
RAWCC = '50-percentile' /Raw Cloud Cover
RAWWV = '20-percentile' /Raw Water Vapour/Transparency
RAWBG = '50-percentile' /Raw Background
TELESCOP= 'Gemini-North' /Gemini-North
EPOCH = 2000.00 /Epoch for Target coordinates
CRPA = 80.7411581783 /Current Cass Rotator Position Angle
AIRMASS = '1.271 ' /Mean airmass for the observation
AMSTART = '1.365 ' /Airmass at start of exposure
AMEND = '1.191 ' /Airmass at end of exposure
HA = '-03:02:41.87' /Hour Angle Sexagesimal
HAD = '-3.0449640' /Hour Angle Decimal
OBSCLASS= 'science ' /Observe class
INSTMODE= 'Spectroscopy, star+sky' /Observing mode
RAWPIREQ= 'YES ' /PI Requirements Met
RAWGEMQA= 'USABLE ' /Gemini Quality Assessment
COMMENT ----------------------------------------------------
COMMENT | Processed by the CFHT OPERA Open Source Pipeline |
COMMENT ----------------------------------------------------
COMMENT opera-1.0.1228 build date Fri May 19 18:53:30 HST 2017
COMMENT Processing Date
COMMENT ---------------
COMMENT Mon May 4 14:50:27 HST 2020
COMMENT ------------------------------------------------------------------------
COMMENT 20
SNR22 = '0.50798 / 0.61051' / snr per spectral / ccd bin
SNR23 = '0.84309 / 1.0133' / snr per spectral / ccd bin
SNR24 = '1.2171 / 1.4627' / snr per spectral / ccd bin
SNR25 = '1.9608 / 2.3567' / snr per spectral / ccd bin
SNR26 = '2.1154 / 2.5425' / snr per spectral / ccd bin
SNR27 = '2.1107 / 2.5368' / snr per spectral / ccd bin
SNR28 = '2.2236 / 2.6725' / snr per spectral / ccd bin
SNR29 = '2.1241 / 2.5528' / snr per spectral / ccd bin
SNR30 = '2.2457 / 2.6991' / snr per spectral / ccd bin
SNR31 = '1.9948 / 2.3975' / snr per spectral / ccd bin
SNR32 = '1.6974 / 2.04' / snr per spectral / ccd bin
SNR33 = '1.4978 / 1.8001' / snr per spectral / ccd bin
SNR34 = '1.2949 / 1.5562' / snr per spectral / ccd bin
SNR35 = '1.3562 / 1.63' / snr per spectral / ccd bin
SNR36 = '1.0198 / 1.2257' / snr per spectral / ccd bin
SNR37 = '1.5021 / 1.8053' / snr per spectral / ccd bin
SNR38 = '1.01 / 1.2139' / snr per spectral / ccd bin
SNR39 = '0.71061 / 0.85405' / snr per spectral / ccd bin
SNR40 = '0.59577 / 0.71603' / snr per spectral / ccd bin
SNR41 = '0.62022 / 0.74541' / snr per spectral / ccd bin
SNR42 = '0.57458 / 0.69056' / snr per spectral / ccd bin
SNR43 = '0.52749 / 0.63397' / snr per spectral / ccd bin
SNR44 = '0.49149 / 0.5907' / snr per spectral / ccd bin
SNR45 = '0.48021 / 0.57714' / snr per spectral / ccd bin
SNR46 = '0.50494 / 0.60687' / snr per spectral / ccd bin
SNR47 = '0.52551 / 0.63159' / snr per spectral / ccd bin
SNR48 = '0.5106 / 0.61367' / snr per spectral / ccd bin
SNR49 = '0.42632 / 0.51237' / snr per spectral / ccd bin
SNR50 = '0.42451 / 0.5102' / snr per spectral / ccd bin
SNR51 = '0.42025 / 0.50508' / snr per spectral / ccd bin
SNR52 = '0.41137 / 0.49441' / snr per spectral / ccd bin
SNR53 = '0.41708 / 0.50127' / snr per spectral / ccd bin
SNR54 = '0.41904 / 0.50362' / snr per spectral / ccd bin
SNR55 = '0.4284 / 0.51487' / snr per spectral / ccd bin
HRV = -8.9311 / Heliocentric RV correction (km/s)
HRVLUNAR= 0.0125599 / lunar component of HRV correction (km/s)
HRVORBIT= -9.19911 / orbital component of HRV correction (km/s)
HRVDIURN= 0.255451 / diurnal component of HRV correction (km/s)
HJDUTC = 2458568.79603 / Heliocentric Julian date (UTC) mid-exposure
HJDTT = 2458568.796831 / Heliocentric Julian date (TT) mid-exposure
TELLRV = 0. / telluric RV correction (km/s)
TELLERR = 0. / telluric RV correction error (km/s)
REDUCTIO= 'Intensity' / Type of reduction
NORMAL = '2 ' / Normalized and Un-normalized Data
COMMENT File contains automatic wavelength correction and uncorrected data.
COL1 = 'Wavelength' / Normalized
COL2 = 'Star ' / Normalized
COL3 = 'Sky ' / Normalized
COL4 = 'Star+sky' / Normalized
COL5 = 'ErrorBarStar' / Normalized
COL6 = 'ErrorBarSky' / Normalized
COL7 = 'ErrorBarStar+Sky' / Normalized
COL8 = 'Wavelength' / UnNormalized
COL9 = 'Star ' / UnNormalized
COL10 = 'Sky ' / UnNormalized
COL11 = 'Star+sky' / UnNormalized
COL12 = 'ErrorBarStar' / UnNormalized
COL13 = 'ErrorBarSky' / UnNormalized
COL14 = 'ErrorBarStar+Sky' / UnNormalized
COL15 = 'Wavelength' / Normalized, no autowave correction
COL16 = 'Star ' / Normalized, no autowave correction
COL17 = 'Sky ' / Normalized, no autowave correction
COL18 = 'Star+sky' / Normalized, no autowave correction
COL19 = 'ErrorBarStar' / Normalized, no autowave correction
COL20 = 'ErrorBarSky' / Normalized, no autowave correction
COL21 = 'ErrorBarStar+Sky' / Normalized, no autowave correction
COL22 = 'Wavelength' / UnNormalized, no autowave correction
COL23 = 'Star ' / UnNormalized, no autowave correction
COL24 = 'Sky ' / UnNormalized, no autowave correction
COL25 = 'Star+sky' / UnNormalized, no autowave correction
COL26 = 'ErrorBarStar' / UnNormalized, no autowave correction
COL27 = 'ErrorBarSky' / UnNormalized, no autowave correction
COL28 = 'ErrorBarStar+Sky' / UnNormalized, no autowave correction
Since they seemed to help I'm rewriting my above comments as an answer:
The data in FITS files (e.g. hdulist[0].data) are returned as Numpy arrays. Numpy is a core library to many scientific Python packages for working with binary array-like data. This is good to be aware of because it's not specific to Astropy or FITS, and any question about how to work with data from FITS files (that isn't Astronomy-specific) is really a question about Numpy. Numpy has a built-in function np.savetxt for this purpose. E.g.
>>> import numpy as np
>>> np.savetxt('test.txt', hdulist[0].data)
np.savetxt takes numerous options you can read about in the API documentation for how to format your data.
(Side question: why do you want to save it as a plain text file? It seems like a fairly large array--is there something you want to do with it as a text file that you can't in binary?)
Second, there are a couple of reasons your attempts to use Table.read failed. For one, your data does not appear to be tabular, so this wouldn't be appropriate for CCD data in this form. Second, Table.read is a more abstract function that just takes the name of a file (or a "file-like object", meaning something that has the same interface as the file objects returned from Python's built-in open function). It automatically guesses how to read the tabular data by recognizing some supported file formats. You were passing it objects it doesn't know what to do with, hence the seemingly obscure errors.
For example Table.read is used like this:
>>> Table.read('path/to/fits/file/containing/table.fits')
Under the hood this is using the astropy.io.fits package to parse the FITS file and extract table contents. The interface is designed to abstract away those details from the user. In your first attempt you passed it an actual HDU object (hdulist[0]). Since you specified format='fits' it tries to use its FITS reader on this object, but it doesn't know what to do with it because it's not a filename or a file-like object. Similar problem with your second attempt.
Finally, the reason this didn't work as you were hoping:
>>> tbdata = fits.getdata('N20190326G0041i.fits')
>>> print(tbdata,file=open('test.txt','a'))
This is no different from print(tbdata). When you print a Numpy array to the screen it has a standard print representation which for large arrays normally truncates the data. Using file= doesn't do anything magic: It just outputs the same thing you would get printing the array to the terminal, but it outputs that text to a file instead of the screen.
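Putting that together, a minimal sketch (a small invented array stands in for hdulist[0].data; the 'test.txt' filename matches the question's):

```python
import numpy as np

# Small stand-in for hdulist[0].data (the real array is 190685 x 28 float32)
data = np.linspace(0.0, 1.0, 6, dtype=np.float32).reshape(2, 3)

# The first argument is the destination file; fmt controls the text formatting
np.savetxt('test.txt', data, fmt='%.7e')

# Round-trip to confirm the full array was written, not a truncated excerpt
roundtrip = np.loadtxt('test.txt')
print(roundtrip.shape)
# (2, 3)
```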

Orange Data Table with specific domain

I'm trying to create an Orange Data Table from a csv file, using the following steps:
1. Create the target domain
2. Read the file into a temporary data table
3. Create a new data table using the data in the temp table and the target domain
Changing the csv to a tab-file with a three-line header (https://docs.orange.biolab.si/3/data-mining-library/reference/data.io.html) is not an option.
Translating this procedure to code gives the following:
from Orange.data import Domain, DiscreteVariable, ContinuousVariable, Table
# Creating specific domain. Two attributes and a Class variable used as target
target_domain = Domain([ContinuousVariable.make("Attribute 1"),ContinuousVariable.make("Attribute 2")],DiscreteVariable.make("Class"))
print('Target domain:',target_domain)
# Target domain: [Attribute 1, Attribute 2 | Class]
# Reading in the file
test_data = Table.from_file('../data/knn_trainingset_example.csv')
print('Domain from file:',test_data.domain)
# Domain from file: [Attribute 1, Attribute 2, Class]
# Using specific domain with test_data
final_data = Table.from_table(target_domain,test_data)
print('Domain:',final_data.domain)
print('Data:')
print(final_data)
# Domain: [Attribute 1, Attribute 2 | Class]
# Data:
# [[0.800, 6.300 | ?],
# [1.400, 8.100 | ?],
# [2.100, 7.400 | ?],
# [2.600, 14.300 | ?],
# [6.800, 12.600 | ?],
# [8.800, 9.800 | ?],
# ...
As you can see from the final print statement the class variable is unknown (?) instead of the expected class (+ or -).
Can someone explain/solve this behavior? Provide a better/different way to create a Data Table with a specific domain?
Yep, thanks! As described in the reference (https://docs.orange.biolab.si/3/data-mining-library/reference/data.variable.html#discrete-variables), you have to supply the possible values. Providing those as a tuple did the trick. For future reference I placed the adjusted code below.
from Orange.data import Domain, DiscreteVariable, ContinuousVariable, Table
# Creating specific domain. Two attributes and a Class variable used as target
target_domain = Domain([ContinuousVariable.make("Attribute 1"),ContinuousVariable.make("Attribute 2")],DiscreteVariable.make("Class",values=('+','-')))
print('Target domain:',target_domain)
# Target domain: [Attribute 1, Attribute 2 | Class]
# Reading in the file
test_data = Table.from_file('../data/knn_trainingset_example.csv')
print('Domain from file:',test_data.domain)
# Domain from file: [Attribute 1, Attribute 2, Class]
print('Data:')
print(test_data)
# [[0.800, 6.300 | −],
# [1.400, 8.100 | −],
# [2.100, 7.400 | −],
# [2.600, 14.300 | +],
# [6.800, 12.600 | −],
# [8.800, 9.800 | +],
# ...
# Using specific domain with test_data
final_data = Table.from_table(target_domain,test_data)
print('Domain:',final_data.domain)
# Domain: [Attribute 1, Attribute 2 | Class]
print('Data:')
print(final_data)
# Data:
# [[0.800, 6.300 | −],
# [1.400, 8.100 | −],
# [2.100, 7.400 | −],
# [2.600, 14.300 | +],
# [6.800, 12.600 | −],
# [8.800, 9.800 | +],
# ...

NEO M8T with RTKRCV on Raspberry Pi 3

I am currently doing a university project and my goal is to find the position of a rover in real time.
My set up is the following: 2x NEO M8T boards connected to a Raspberry Pi 3 (updated to latest version GNU/Linux 8).
The reason that they are both connected to the Pi is that I am not sure that my SiK telemetry transmits anything as even in RTK Navi on a laptop I don't get base station data. (the radios are matched)
The M8Ts are set to a 115200 baud rate using u-center (latest version). NMEA messages are turned off and UBX messages are turned on.
I installed the latest version of RTKLIB from tomojitakasu's github on the Pi. Ran make in the rtkrcv folder.
Ran chmod +x rtkstart.sh and chmod +x rtkshut.sh as it wanted permissions.
Started rtkrcv with sudo ./rtkrcv
I get "invalid option value pos-1snrmask" but the program still runs.
I run a conf file which I created, but I don't know if it is correct.
It says "startup script ok", then "rtk server start error", and that's it... nothing else.
The conf file I use is as following:
# RTKRCV options for RTK (2014/10/24, tyan)
console-passwd =admin
console-timetype =gpst # (0:gpst,1:utc,2:jst,3:tow)
console-soltype =dms # (0:dms,1:deg,2:xyz,3:enu,4:pyl)
console-solflag =1 # (0:off,1:std+2:age/ratio/ns)
## Specify connection type for Rover (1), Base (2) and Correction (3) streams
inpstr1-type =serial # (0:off,1:serial,2:file,3:tcpsvr,4:tcpcli,7:ntripcli,8:ftp,9:http)
inpstr2-type =serial # (0:off,1:serial,2:file,3:tcpsvr,4:tcpcli,7:ntripcli,8:ftp,9:http)
##inpstr3-type =serial # (0:off,1:serial,2:file,3:tcpsvr,4:tcpcli,7:ntripcli,8:ftp,9:http)
## Specify connection parameters for each stream
inpstr1-path = ttyACM0:115200:8:n:1:off
inpstr2-path = ttyACM1:115200:8:n:1:off
##inpstr3-path =
## Specify data format for each stream
inpstr1-format =ubx # (0:rtcm2,1:rtcm3,2:oem4,3:oem3,4:ubx,5:ss2,6:hemis,7:skytraq,8:sp3)
inpstr2-format =ubx # (0:rtcm2,1:rtcm3,2:oem4,3:oem3,4:ubx,5:ss2,6:hemis,7:skytraq,8:sp3)
##inpstr3-format = # (0:rtcm2,1:rtcm3,2:oem4,3:oem3,4:ubx,5:ss2,6:hemis,7:skytraq,8:sp3)
## Configure the NMEA string to send to get Base stream. Required for VRS.
inpstr2-nmeareq =off # (0:off,1:latlon,2:single)
inpstr2-nmealat =0 # (deg)
inpstr2-nmealon =0 # (deg)
## Configure where to send the solutions
outstr1-type =off # (0:off,1:serial,2:file,3:tcpsvr,4:tcpcli,6:ntripsvr)
outstr2-type =off # (0:off,1:serial,2:file,3:tcpsvr,4:tcpcli,6:ntripsvr)
## Specify here which stream contains the navigation message.
misc-navmsgsel =corr # (0:all,1:rover,2:base,3:corr)
misc-startcmd =./rtkstart.sh
misc-stopcmd =./rtkshut.sh
## Set the command file to send prior to requesting stream (if required)
file-cmdfile1 =/home/pi/rtklib/app/rtkrcv/gcc/m8t.cmd
file-cmdfile2 =/home/pi/rtklib/app/rtkrcv/gcc/m8t.cmd
## file-cmdfile3 =
pos1-posmode =static # (0:single,1:dgps,2:kinematic,3:static,4:movingbase,5:fixed,6:ppp-kine,7:ppp-static)
pos1-frequency =l1 # (1:l1,2:l1+l2,3:l1+l2+l5)
pos1-soltype =forward # (0:forward,1:backward,2:combined)
pos1-elmask =15 # (deg)
pos1-snrmask_L1 =0 # (dBHz)
pos1-dynamics =off # (0:off,1:on)
pos1-tidecorr =off # (0:off,1:on)
pos1-ionoopt =brdc # (0:off,1:brdc,2:sbas,3:dual-freq,4:est-stec)
pos1-tropopt =saas # (0:off,1:saas,2:sbas,3:est-ztd,4:est-ztdgrad)
pos1-sateph =brdc # (0:brdc,1:precise,2:brdc+sbas,3:brdc+ssrapc,4:brdc+ssrcom)
pos1-exclsats = # (prn ...)
## Set which GNSS to use. 1 is GPS only, 4 is GLONASS only. Add codes for multiple systems. Eg. (1+4)=5 is GPS+GLONASS.
pos1-navsys =7 # (1:gps+2:sbas+4:glo+8:gal+16:qzs+32:comp)
## Ambiguity Resolution mode, set to continuous to obtain fixed solutions
pos2-armode =fix-and-hold # (0:off,1:continuous,2:instantaneous,3:fix-and-hold)
pos2-gloarmode =off # (0:off,1:on,2:autocal)
pos2-arthres =3
pos2-arlockcnt =0
pos2-arelmask =0 # (deg)
pos2-aroutcnt =5
pos2-arminfix =10
pos2-slipthres =0.05 # (m)
pos2-maxage =30 # (s)
pos2-rejionno =30 # (m)
pos2-niter =1
pos2-baselen =0 # (m)
pos2-basesig =0 # (m)
out-solformat =llh # (0:llh,1:xyz,2:enu,3:nmea)
out-outhead =on # (0:off,1:on)
out-outopt =off # (0:off,1:on)
out-timesys =gpst # (0:gpst,1:utc,2:jst)
out-timeform =tow # (0:tow,1:hms)
out-timendec =3
out-degform =deg # (0:deg,1:dms)
out-fieldsep =
out-height =ellipsoidal # (0:ellipsoidal,1:geodetic)
out-geoid =internal # (0:internal,1:egm96,2:egm08_2.5,3:egm08_1,4:gsi2000)
out-solstatic =all # (0:all,1:single)
out-nmeaintv1 =0 # (s)
out-nmeaintv2 =0 # (s)
out-outstat =off # (0:off,1:state,2:residual)
stats-errratio =100
stats-errphase =0.003 # (m)
stats-errphaseel =0.003 # (m)
stats-errphasebl =0 # (m/10km)
stats-errdoppler =1 # (Hz)
stats-stdbias =30 # (m)
stats-stdiono =0.03 # (m)
stats-stdtrop =0.3 # (m)
stats-prnaccelh =1 # (m/s^2)
stats-prnaccelv =0.1 # (m/s^2)
stats-prnbias =0.0001 # (m)
stats-prniono =0.001 # (m)
stats-prntrop =0.0001 # (m)
stats-clkstab =5e-12 # (s/s)
ant1-postype =llh # (0:llh,1:xyz,2:single,3:posfile,4:rinexhead,5:rtcm)
ant1-pos1 =0 # (deg|m)
ant1-pos2 =0 # (deg|m)
ant1-pos3 =0 # (m|m)
ant1-anttype =
ant1-antdele =0 # (m)
ant1-antdeln =0 # (m)
ant1-antdelu =0 # (m)
Please, help!!
Regards
Arnaudov
