Reading a file and parsing the data - python-3.5

I'm reading a file and printing only the lines that contain "load". That works, but I want the output to be clearer. Each output line currently looks like [soi-aahh] out: 16:45:50 up 436 days, 2:06, 5 users, load average: 0.08, 0.02, 0.00, where the hostname is enclosed in brackets and followed by out:. I want to remove the brackets and out:, and ideally print the data in a two-column format:
$ cat logs.py
#!/python/v3.6.1/bin/python3
with open("file_1") as f:
    data = f.read()
for line in data.splitlines():
    if "load" in line:
        print(line)
        print("")
File contents
$ cat file_1
[soi-aahh] sudo: uptime
[soi-aahh] out: sudo password:
[soi-aahh] out: 16:45:50 up 436 days, 2:06, 5 users, load average: 0.08, 0.02, 0.00
[soi-aahh] out:
[soi-aabk] sudo: uptime
[soi-aabk] out: sudo password:
[soi-aabk] out: 16:45:50 up 586 days, 23:08, 7 users, load average: 1.01, 1.03, 1.00
[soi-aabk] out:
[soi-abrrj] sudo: uptime
[soi-abrrj] out: sudo password:
[soi-abrrj] out: 16:45:50 up 219 days, 6:31, 4 users, load average: 0.00, 0.00, 0.00
[soi-abrrj] out:
[soi-ritsh] sudo: uptime
[soi-ritsh] out: sudo password:
[soi-ritsh] out: 16:45:50 up 586 days, 23:13, 15 users, load average: 5.01, 5.02, 5.04
[soi-ritsh] out:
Script output:
$ ./logs.py
[soi-aahh] out: 16:45:50 up 436 days, 2:06, 5 users, load average: 0.08, 0.02, 0.00
[soi-aabk] out: 16:45:50 up 586 days, 23:08, 7 users, load average: 1.01, 1.03, 1.00
[soi-abrrj] out: 16:45:50 up 219 days, 6:31, 4 users, load average: 0.00, 0.00, 0.00
[soi-ritsh] out: 16:45:50 up 586 days, 23:13, 15 users, load average: 5.01, 5.02, 5.04
Desired:
Hostname Uptime
soi-aahh 16:45:50 up 436 days
OR at least below:
soi-aahh: 16:45:50 up 436 days, 2:06, 5 users, load average: 0.08, 0.02, 0.00
Please suggest if there is a better way to read the file and do this.

You may want to trim the line:
with open("file_1") as f:
    data = f.read()
print('Hostname \t Uptime')
for line in data.splitlines():
    if "load" in line:
        print(line.replace('] out: ', '\t').strip('['))
        print("")
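If you want the shorter two-column form from the question (hostname plus only the "up N days" part), a regular expression is one way to do it. This is a minimal sketch rather than the only approach; the function names parse_uptime_lines and format_table are made up for illustration, and it assumes every interesting line has the shape "[host] out: ... load average ...".

```python
import re

# Assumed line shape: "[host] out: <uptime text containing 'load average'>"
LINE_RE = re.compile(r'^\[(?P<host>[^\]]+)\] out: (?P<uptime>.*load average.*)$')


def parse_uptime_lines(lines):
    """Return (hostname, uptime_text) pairs for the lines that match."""
    rows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            rows.append((m.group('host'), m.group('uptime').strip()))
    return rows


def format_table(rows):
    """Build a two-column table; the uptime column is cut at the first comma."""
    width = max([len('Hostname')] + [len(host) for host, _ in rows])
    out = ['Hostname'.ljust(width) + '  Uptime']
    for host, uptime in rows:
        out.append(host.ljust(width) + '  ' + uptime.split(',')[0])
    return '\n'.join(out)


sample = [
    "[soi-aahh] sudo: uptime",
    "[soi-aahh] out: sudo password:",
    "[soi-aahh] out: 16:45:50 up 436 days, 2:06, 5 users, load average: 0.08, 0.02, 0.00",
    "[soi-aahh] out:",
]
print(format_table(parse_uptime_lines(sample)))
```

To run it on the real file, pass the open file object instead of sample, for example parse_uptime_lines(open("file_1")).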

Related

Massaging data issues for an AI learning project using python which I'm also learning

I'm having some issues with cleaning up data to be used in a prediction project, March madness picks. Sample raw data;
Winner Stats
,,Overall,Overall,Overall,Overall,Overall,Overall,Conf.,Conf.,Home,Home,Away,Away,Points,Points,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced,School Advanced
Rk,School,Games,Wins,Loss,W-L%,SRS,SOS,Wins,Loss,Wins,Loss,Wins,Loss,Team,Opp.,Pace,ORtg,FTr,3PAr,TS%,TRB%,AST%,STL%,BLK%,eFG%,TOV%,ORB%,FT/FGA
1,Abilene Christian NCAA,29,24,5,0.828,6.27,-6.37,13,2,13,0,5,4,2196,1753,72.0,105.2,0.339,0.367,0.540,51.5,65.3,13.5,8.9,0.512,16.2,31.3,0.230
2,Air Force,25,5,20,0.200,-12.98,0.22,3,17,3,8,1,10,1468,1798,63.9,91.9,0.309,0.450,0.547,40.1,59.0,11.1,8.0,0.521,21.2,15.3,0.214
3,Akron,23,15,8,0.652,1.85,-1.96,12,6,9,1,5,5,1798,1660,70.1,110.2,0.303,0.461,0.555,51.7,48.6,6.9,8.9,0.520,14.3,31.0,0.230
The first issue was that the data had several header lines, so I used this code to skip those lines:
WinnerStats = pd.read_csv('Winners_Stats.csv', skiprows=2)
This worked and gave;
Rk School Games Wins Loss W-L% SRS SOS Wins.1 Loss.1 ... 3PAr TS% TRB% AST% STL% BLK% eFG% TOV% ORB% FT/FGA
0 1 Abilene Christian NCAA 29 24 5 0.828 6.27 -6.37 13 2 ... 0.367 0.540 51.5 65.3 13.5 8.9 0.512 16.2 31.3 0.230
1 2 Air Force 25 5 20 0.200 -12.98 0.22 3 17 ... 0.450 0.547 40.1 59.0 11.1 8.0 0.521 21.2 15.3 0.214
2 3 Akron 23 15 8 0.652 1.85 -1.96 12 6 ... 0.461 0.555 51.7 48.6 6.9 8.9 0.520 14.3 31.0 0.230
[356 rows x 29 columns]
According to the data description, if the team name has NCAA appended at the end, the team went to the tournament. So, after much research, I wrote this code to keep only the rows that have NCAA in the name.
mask1 = WinnerStats['School'].str.contains('NCAA', case=False, na=False)
ws1 = WinnerStats[mask1]
This worked.
Rk School Games Wins Loss W-L% SRS SOS Wins.1 Loss.1 ... 3PAr TS% TRB% AST% STL% BLK% eFG% TOV% ORB% FT/FGA
0 1 Abilene Christian 29 24 5 0.828 6.27 -6.37 13 2 ... 0.367 0.540 51.5 65.3 13.5 8.9 0.512 16.2 31.3 0.230
6 7 Alabama 33 26 7 0.788 19.58 10.01 16 2 ... 0.465 0.544 51.7 50.2 11.3 10.6 0.517 15.9 31.8 0.204
10 11 Appalachian State 29 17 12 0.586 -5.84 -5.72 7 8 ... 0.437 0.520 50.8 49.9 11.6 10.4 0.478 14.9 29.8 0.252
[68 rows x 29 columns]
So now I'm at the point where I want to start using the data to predict some outcomes (this is old data, so I already know what the program should be able to predict). I need to match the two teams up, T1 vs T2. I have another dataset that has the matchups, but it's by team ID. I have a third dataset that has the team name and TeamID.
TeamID = pd.read_csv('MTeams.csv')
mask1 = WinnerStats['School'].str.contains('NCAA', case=False, na=False)
ws1 = WinnerStats[mask1] # just the NCAA teams
ws1['School'] = WinnerStats['School'].str.slice(start=0, stop=-5) #Remove NCAA from the school name
This code gets all rows that have NCAA in the name into ws1 via the mask. The last line, which removes NCAA from the name, works but gives this warning:
mm2021_Data_Cleaner.py:46: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ws1['School'] = WinnerStats['School'].str.slice(start=0, stop=-5)
I don't know enough about Python to say whether I can ignore the warning or need to make some kind of change. I followed the link, but none of the explanations made sense to me for what I'm trying to do.
The second problem is coding up a way to compare the names in TeamID with those in WinnerStats. To match TeamID to WinnerStats I need to match on a similar school name; for example, "Abilene Christian" (from WinnerStats) to "Abilene Chr" (TeamID). So I tried this code,
ws1['TeamID'] = TeamID['TeamID'].where(WinnerStats['School'].str.contains(TeamID['School']))
which gives this error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'School'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/March_Madness/mm2021_Data_Cleaner.py", line 46, in <module>
ws1['TeamID'] = TeamID['TeamID'].where(WinnerStats['School'].str.contains(TeamID['School']))
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'School'
dennis#MBP2021 March_Madness %
I have tried several different permutations and other code, and I'm pretty lost; I think this is the last code I tried. All attempts gave similar strange error messages.
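One hedged observation: the KeyError: 'School' in the traceback suggests that the TeamID frame has no column called School (WinnerStats clearly has one), so it may be worth printing TeamID.columns to see the actual column name. For the "similar school name" part, a possible approach is approximate string matching with difflib from the standard library. The sketch below is an illustration under assumptions, not a tested solution for the real files: match_team_names is a made-up helper, the 0.6 cutoff is a guess, and "Appalachian St" is an invented short name used only for the demo.

```python
import difflib


def match_team_names(full_names, short_names, cutoff=0.6):
    """Map each full school name to its closest short name, or None if nothing is close."""
    matches = {}
    for name in full_names:
        best = difflib.get_close_matches(name, short_names, n=1, cutoff=cutoff)
        matches[name] = best[0] if best else None
    return matches


# Names as they might appear in the stats table (after removing " NCAA")
stats_names = ["Abilene Christian", "Alabama", "Appalachian State"]
# Names as they might appear in the team ID table (partly hypothetical)
team_names = ["Abilene Chr", "Alabama", "Appalachian St", "Air Force"]

for full, short in match_team_names(stats_names, team_names).items():
    print(full, "->", short)
```

The resulting mapping could then be used to look up the team ID for each school. Approximate matching can pick the wrong team for very similar names, so the pairs are worth checking by eye.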

Extracting specific lines from multiple text files in the same folder for the entire folder

I am working with meteorological data and want to extract lines 319-356, which correspond to the grids I am working with, for all the stations (the resulting text file should contain the header and the data in lines 319-356). I have developed the following code in Python, but it returned blank text files. How should I improve it? Thank you in advance!
import os
for filename in os.listdir('P11'):
    if filename.endswith(".txt"):
        with open(os.path.join('P11', filename), 'r') as f:
            data = f.read()
            header = f.readline()
            content = f.readlines()[1:]
            needed_lines = content[320:357]
            output_strings = map(header, needed_lines)
            output_content = " ".join(output_strings)
        with open(os.path.join('P11', filename), 'wt') as outfile:
            outfile.write(output_content)
This is what the result should look like:
20080101
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.05
0.01
0.00
0.00
0.00
0.00
0.00
0.26
0.00
0.00
0.01
0.01
1.08
0.96
0.65
0.00
0.00
0.00
0.00
0.00
0.34
0.00
This works. Replace the specifics as needed, such as the file names and the range of lines to extract.
#file names
fileName = "./textFile"
outFileName = "./file2"
#open file
file = open(fileName)
#get each file line
lines = file.readlines()
#get rid of any new line characters
lines = [i.replace("\n", "") for i in lines]
print(lines)
#close file
file.close()
#open output
outF = open(outFileName, "w")
#make output, one extracted line per output line
s = "\n".join(lines[2:6])
#write desired lines
outF.write(s)
#close output
outF.close()
Here it is without comments:
fileName = "./textFile"
outFileName = "./file2"
file = open(fileName)
lines = file.readlines()
lines = [i.replace("\n", "") for i in lines]
file.close()
outF = open(outFileName, "w")
s = "\n".join(lines[2:6])
outF.write(s)
outF.close()
Edit: Some insight about your code: f.read() already consumes the whole file, so the f.readline() and f.readlines() calls that follow return nothing, which is why the output files come out blank. A with block does not create a new scope, so output_content is still visible in the second with. Also, why are you using map()? Its first argument has to be a function, and header is a string.
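Applied to the whole folder, the same idea might look like the sketch below. It is only a sketch under a few assumptions: line numbers count from the top of the file with the header as line 1, the results go to a separate folder (the name P11_out is made up) so the originals are not overwritten, and select_lines and extract_folder are hypothetical helper names.

```python
import os


def select_lines(lines, first=319, last=356):
    """Return the header (first line) plus lines first..last (1-based, inclusive)."""
    if not lines:
        return []
    # List indices are 0-based, so line number n lives at index n - 1.
    return [lines[0]] + lines[first - 1:last]


def extract_folder(src_dir, dst_dir, first=319, last=356):
    """Write the selected lines of every .txt file in src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for filename in sorted(os.listdir(src_dir)):
        if not filename.endswith(".txt"):
            continue
        with open(os.path.join(src_dir, filename)) as f:
            lines = f.readlines()  # read the file exactly once
        with open(os.path.join(dst_dir, filename), "w") as out:
            out.writelines(select_lines(lines, first, last))


if __name__ == "__main__" and os.path.isdir("P11"):
    extract_folder("P11", "P11_out")
```

Reading each file once with readlines() and slicing the resulting list avoids the problem of calling read() and then trying to read again from an exhausted file.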

Missing Date xticks on chart for matplotlib on Python 3. Bug?

I am following this section. I realize the code was written for Python 2, but in the original the xticks show dates on the 'Start Date' axis and mine do not: my chart only shows the axis label Start Date with no dates. I tried converting the column from object to datetime; that shows the dates but breaks the graph below it, and the line goes missing:
Graph
# Set as_index=False to keep the 0,1,2,... index. Then we'll take the mean of the polls on that day.
poll_df = poll_df.groupby(['Start Date'],as_index=False).mean()
# Let's go ahead and see what this looks like
poll_df.head()
Start Date Number of Observations Obama Romney Undecided Difference
0 2009-03-13 1403 44 44 12 0.00
1 2009-04-17 686 50 39 11 0.11
2 2009-05-14 1000 53 35 12 0.18
3 2009-06-12 638 48 40 12 0.08
4 2009-07-15 577 49 40 11 0.09
Great! Now plotting the Difference versus time should be straight forward.
# Plotting the difference in polls between Obama and Romney
fig = poll_df.plot('Start Date','Difference',figsize=(12,4),marker='o',linestyle='-',color='purple')
Notebook is here

Bash format uptime to show days, hours, minutes

I'm using uptime in bash in order to get the current runtime of the machine. I need to grab the time and display a format like 2 days, 12 hours, 23 minutes.
My uptime produces output that looks like:
$ uptime
12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52
To convert that to your format:
$ uptime | awk -F'( |,|:)+' '{print $6,$7",",$8,"hours,",$9,"minutes."}'
25 days, 21 hours, 30 minutes.
How it works
-F'( |,|:)+'
awk divides its input up into fields. This tells awk to use any combination of one or more of space, comma, or colon as the field separator.
print $6,$7",",$8,"hours,",$9,"minutes."
This tells awk to print the sixth and seventh fields (separated by a space) followed by a comma, the eighth field, the string "hours,", the ninth field, and, lastly, the string "minutes.".
Handling computers with short uptimes using sed
Starting from a reboot, my uptime produces output like:
03:14:20 up 1 min, 2 users, load average: 2.28, 1.29, 0.50
04:12:29 up 59 min, 5 users, load average: 0.06, 0.08, 0.48
05:14:09 up 2:01, 5 users, load average: 0.13, 0.10, 0.45
03:13:19 up 1 day, 0 min, 8 users, load average: 0.01, 0.04, 0.05
04:13:19 up 1 day, 1:00, 8 users, load average: 0.02, 0.05, 0.21
12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52
The following sed command handles these formats:
uptime | sed -E 's/^[^,]*up *//; s/, *[[:digit:]]* users.*//; s/min/minutes/; s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/'
With the above times, this produces:
1 minutes
59 minutes
2 hours, 1 minutes
1 day, 0 minutes
1 day, 1 hours, 0 minutes
25 days, 21 hours, 30 minutes
How it works
-E turns on extended regular expression syntax. (On older GNU seds, use -r in place of -E)
s/^[^,]*up *//
This substitute command removes all text up to and including up.
s/, *[[:digit:]]* users.*//
This substitute command removes the user count and all text which follows it.
s/min/minutes/
This replaces min with minutes.
s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/
If the line contains a time in the hh:mm format, this separates the hours from the minutes and replaces it with hh hours, mm minutes.
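For readers who find the four substitutions easier to follow in a scripting language, here is a rough Python transcription of the same steps. It is a sketch, not part of the sed answer: humanize_uptime is a made-up name, and it assumes the input lines look like the six samples above.

```python
import re


def humanize_uptime(line):
    """Apply the same four substitutions as the sed command, in the same order."""
    # 1. Remove everything up to and including "up".
    s = re.sub(r'^[^,]*up *', '', line, count=1)
    # 2. Remove the user count and everything after it.
    s = re.sub(r', *\d* users.*', '', s, count=1)
    # 3. Spell out "min".
    s = s.replace('min', 'minutes', 1)
    # 4. Turn hh:mm into "hh hours, mm minutes" (dropping one leading zero).
    s = re.sub(r'(\d+):0?(\d+)', r'\1 hours, \2 minutes', s, count=1)
    return s


samples = [
    "03:14:20 up 1 min, 2 users, load average: 2.28, 1.29, 0.50",
    "05:14:09 up 2:01, 5 users, load average: 0.13, 0.10, 0.45",
    "04:13:19 up 1 day, 1:00, 8 users, load average: 0.02, 0.05, 0.21",
    "12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52",
]
for sample in samples:
    print(humanize_uptime(sample))
```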
Handling computers with short uptimes using awk
uptime | awk -F'( |,|:)+' '{d=h=m=0; if ($7=="min") m=$6; else {if ($7~/^day/) {d=$6;h=$8;m=$9} else {h=$6;m=$7}}} {print d+0,"days,",h+0,"hours,",m+0,"minutes."}'
On the same test cases as above, this produces:
0 days, 0 hours, 1 minutes.
0 days, 0 hours, 59 minutes.
0 days, 2 hours, 1 minutes.
1 days, 0 hours, 0 minutes.
1 days, 1 hours, 0 minutes.
25 days, 21 hours, 30 minutes.
For those who prefer awk code spread out over multiple lines:
uptime | awk -F'( |,|:)+' '{
d=h=m=0;
if ($7=="min")
m=$6;
else {
if ($7~/^day/) { d=$6; h=$8; m=$9}
else {h=$6;m=$7}
}
}
{
print d+0,"days,",h+0,"hours,",m+0,"minutes."
}'
Just for completeness... what about:
$ uptime -p
up 2 weeks, 3 days, 14 hours, 27 minutes
Solution: To get the Linux uptime in seconds, open bash and run cat /proc/uptime. Parse the first number and convert it as required.
From RedHat documentation:
This file contains information detailing how long the system has been on since its last restart. The output of /proc/uptime is quite minimal:
350735.47 234388.90
The first number is the total number of seconds the system has been up.
The second number is how much of that time the machine has spent idle, in seconds.
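The "parse the first number and convert it" step can be sketched as follows. This is an illustration in Python rather than bash, with made-up function names (format_uptime, read_uptime_seconds); it assumes a Linux system where /proc/uptime exists and simply does nothing elsewhere.

```python
import os


def format_uptime(total_seconds):
    """Convert a number of seconds to 'D days, H hours, M minutes'."""
    total_minutes = int(total_seconds) // 60
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return "{} days, {} hours, {} minutes".format(days, hours, minutes)


def read_uptime_seconds(path="/proc/uptime"):
    """Return the first number in /proc/uptime: seconds since the last restart."""
    with open(path) as f:
        return float(f.read().split()[0])


if __name__ == "__main__" and os.path.exists("/proc/uptime"):
    print(format_uptime(read_uptime_seconds()))
```

With the value quoted from the documentation, format_uptime(350735.47) gives 4 days, 1 hours, 25 minutes.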
I made a universal shell script for systems that support uptime -p, like newer Linux, and for those that don't, like Mac OS X.
#!/bin/sh
uptime -p >/dev/null 2>&1
if [ "$?" -eq 0 ]; then
  # Supports most Linux distro
  # when the machine is up for less than 1 minute then
  # 'uptime -p' returns ONLY 'up', so we need to set a default value
  UP_SET_OR_EMPTY=$(uptime -p | awk -F 'up ' '{print $2}')
  UP=${UP_SET_OR_EMPTY:-'less than a minute'}
else
  # Supports Mac OS X, Debian 7, etc
  UP=$(uptime | sed -E 's/^[^,]*up *//; s/mins/minutes/; s/hrs?/hours/;
  s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/;
  s/^1 hours/1 hour/; s/ 1 hours/ 1 hour/;
  s/min,/minutes,/; s/ 0 minutes,/ less than a minute,/; s/ 1 minutes/ 1 minute/;
  s/ / /; s/, *[[:digit:]]* users?.*//')
fi
echo "up $UP"
Gist
Referenced John1024 answer with my own customizations.
For this:
0 days, 0 hours, 1 minutes.
0 days, 0 hours, 59 minutes.
0 days, 2 hours, 1 minutes.
1 days, 0 hours, 0 minutes.
1 days, 1 hours, 0 minutes.
25 days, 21 hours, 30 minutes
A simpler option is:
uptime -p | cut -d " " -f2-
For the sake of variety, here's an example with sed:
My raw output:
$ uptime
15:44:56 up 3 days, 22:58, 7 users, load average: 0.48, 0.40, 0.31
Converted output:
$ uptime | sed 's/.*\([0-9]\+ days\), \([0-9]\+\):\([0-9]\+\).*/\1, \2 hours, \3 minutes./'
3 days, 22 hours, 58 minutes.
This answer is pretty specific for the uptime shipped in OS X, but takes into account any case of output.
#!/bin/bash
INFO=`uptime`
echo $INFO | awk -F'[ ,:\t\n]+' '{
  msg = "↑ "
  if ($5 == "day" || $5 == "days") { # up for a day or more
    msg = msg $4 " " $5 ", "
    n = $6
    o = $7
  } else {
    n = $4
    o = $5
  }
  if (int(o) == 0) { # words evaluate to zero
    msg = msg int(n)" "o
  } else { # hh:mm format
    msg = msg int(n)" hr"
    if (n > 1) { msg = msg "s" }
    msg = msg ", " int(o) " min"
    if (o > 1) { msg = msg "s" }
  }
  print "[", msg, "]"
}'
Some example possible outputs:
22:49 up 24 secs, 2 users, load averages: 8.37 2.09 0.76
[ ↑ 24 secs ]
22:50 up 1 min, 2 users, load averages: 5.59 2.39 0.95
[ ↑ 1 min ]
23:39 up 51 mins, 3 users, load averages: 2.18 1.94 1.74
[ ↑ 51 mins ]
23:54 up 1:06, 3 users, load averages: 3.67 2.57 2.07
[ ↑ 1 hr, 6 mins ]
16:20 up 120 days, 10:46, 3 users, load averages: 1.21 2.88 0.80
[ ↑ 120 days, 10 hrs, 46 mins ]
uptime_minutes() {
  set `uptime -p`
  local minutes=0
  shift
  while [ -n "$1" ]; do
    case $2 in
      day*)
        ((minutes+=$1*1440))
        ;;
      hour*)
        ((minutes+=$1*60))
        ;;
      minute*)
        ((minutes+=$1))
        ;;
    esac
    shift
    shift
  done
  echo $minutes
}

Referring to objects using variable strings in R

Edit: Thanks to those who have responded so far; I'm very much a beginner in R and have just taken on a large project for my MSc dissertation, so I am a bit overwhelmed with the initial processing. The data I'm using is as follows (from WMO publicly available rainfall data):
120 6272100 KHARTOUM 15.60 32.55 382 1899 1989 0.0
1899 0.03 0.03 0.03 0.03 0.03 1.03 13.03 12.03 9999 6.03 0.03 0.03
1900 0.03 0.03 0.03 0.03 0.03 23.03 80.03 47.03 23.03 8.03 0.03 0.03
1901 0.03 0.03 0.03 0.03 0.03 17.03 23.03 17.03 0.03 8.03 0.03 0.03
(...)
120 6272101 JEBEL AULIA 15.20 32.50 380 1920 1988 0.0
1920 0.03 0.03 0.03 0.00 0.03 6.90 20.00 108.80 47.30 1.00 0.01 0.03
1921 0.03 0.03 0.03 0.00 0.03 0.00 88.00 57.00 35.00 18.50 0.01 0.03
1922 0.03 0.03 0.03 0.00 0.03 0.00 87.50 102.30 10.40 15.20 0.01 0.03
(...)
There are ~100 observation stations that I'm interested in, each of which has a varying start and end date for rainfall measurements. They're formatted as above in a single data file, with stations separated by "120 (station number) (station name)".
I need first to separate this file by station, then to extract March, April, May and June for each year, then take a total of these months for each year. So far I'm messing around with loops (as below), but I understand this isn't the right way to go about it and would rather learn some better technique.
Thanks again for the help!
(Original question:)
I've got a large data set containing rainfall by season for ~100 years over 100+ locations. I'm trying to separate this data into more manageable arrays, and in particular I want to retrieve the sum of the rainfall for March, April, May and June for each station for each year.
The following is a simplified version of my code so far:
a <- array(1,dim=c(10,12))
for (i in 1:5) {
  #all data:
  assign(paste("station_",i,sep=""), a)
  #march - june data:
  assign(paste("station_",i,"_mamj",sep=""), a[,4:7])
}
So this gives me station_(i)_mamj, which contains the data for the months I'm interested in for each station. Now I want to sum each row of this array and enter it in a new array called station_(i)_mamj_tot. Simple enough in theory, but I can't work out how to reference station_(i)_mamj so that it varies the value of i with each iteration. Any help much appreciated!
This is totally begging for a dataframe, then it's just this one-liner with power-tools like ddply (amazingly powerful):
tot_mamj <- ddply(rain[rain$month %in% 3:6,-2], 'year', colwise(sum))
giving your aggregate of total for M/A/M/J, by year:
year station_1 station_2 station_3 station_4 station_5 ...
1 1972 8.618960 5.697739 10.083192 9.264512 11.152378 ...
2 1973 18.571748 18.903280 11.832462 18.262272 10.509621 ...
3 1974 22.415201 22.670821 32.850745 31.634717 20.523778 ...
4 1975 16.773286 17.683704 18.259066 14.996550 19.007762 ...
...
Below is perfectly working code. We create a dataframe whose col.names are 'station_n'; also extra columns for year and month (factor, or else integer if you're lazy, see the footnote). Now you can do arbitrary analysis by month or year (using plyr's split-apply-combine paradigm):
require(plyr) # for d*ply, summarise
#require(reshape) # for melt
# Parameterize everything here, it's crucial for testing/debugging
all_years <- c(1970:2011)
nYears <- length(all_years)
nStations <- 101
# We want station names as vector of chr (as opposed to simple indices)
station_names <- paste ('station_', 1:nStations, sep='')
rain <- data.frame(cbind(
  year=rep(c(1970:2011),12),
  month=1:12
))
# Fill in NAs for all data
rain[,station_names] <- as.numeric(NA)
# Make 'month' a factor, to prevent any numerical funny stuff e.g accidentally 'aggregating' it
rain$month <- factor(rain$month)
# For convenience, store the row indices for all years, M/A/M/J
I.mamj <- which(rain$month %in% 3:6)
# Insert made-up seasonal data for M/A/M/J for testing... leave everything else NA intentionally
rain[I.mamj,station_names] <- c(3,5,9,6) * runif(4*nYears*nStations)
# Get our aggregate of MAMJ totals, by year
# The '-2' column index means: "exclude month, to prevent it also getting 'aggregated'"
excludeMonthCol = -2
tot_mamj <- ddply(rain[rain$month %in% 3:6, excludeMonthCol], 'year', colwise(sum))
# voila!!
# year station_1 station_2 station_3 station_4 station_5
# 1 1972 8.618960 5.697739 10.083192 9.264512 11.152378
# 2 1973 18.571748 18.903280 11.832462 18.262272 10.509621
# 3 1974 22.415201 22.670821 32.850745 31.634717 20.523778
# 4 1975 16.773286 17.683704 18.259066 14.996550 19.007762
As a footnote, before I converted month from numeric to factor, it was getting silently 'aggregated' (until I put in the '-2': exclude column reference).
However, better still is when you make it a factor, it will refuse point-blank to be aggregate'd, and throw an error (which is desirable for debugging):
ddply(rain[rain$month %in% 3:6, ], 'year', colwise(sum))
Error in Summary.factor(c(3L, 3L, 3L, 3L, 3L, 3L), na.rm = FALSE) :
sum not meaningful for factors
For your original question, use get():
i <- 10
var <- paste("test", i, sep="_")
assign(var, 10)
get(var)
As David said, this is probably not the best path to be taking, but it can be useful at times (and IMO the assign/get construct is far better than eval(parse))
Why are you using assign to create variables like station1, station2, station_3_mamj and so on? It would be much easier and more intuitive to store them in a list, like stations[[1]], stations[[2]], stations_mamj[[3]], and such. Then each could be accessed using their index.
Since it looks like each piece of per-station data you're working with is a matrix of the same size, you could even deal with them as a three-dimensional matrix.
ETA: Incidentally, if you really want to solve the problem this way, you would do:
eval(parse(text=paste("station", i, "mamj", sep="_")))
But don't: using eval is almost always bad practice, and it will make even simple operations on your data difficult.
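To make the "one container instead of numbered variables" advice concrete for the file format shown in the question, here is a sketch written in Python rather than R. It is only an illustration under assumptions: a line whose first field is 120 starts a new station, the station name is whatever sits between the station number and the last six numeric fields, data lines have a year plus 12 monthly values, and 9999 marks a missing value. The names parse_stations, mamj_totals and MISSING are made up.

```python
MISSING = 9999.0  # assumption: 9999 marks a missing monthly value


def parse_stations(lines):
    """Return {station_name: {year: [12 monthly values]}}."""
    stations = {}
    current = None
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("("):
            continue  # skip blank lines and "(...)" elisions
        if fields[0] == "120":
            # Header: 120, station number, name (may contain spaces), six numbers.
            name = " ".join(fields[2:-6])
            current = stations.setdefault(name, {})
        elif current is not None and len(fields) == 13:
            current[int(fields[0])] = [float(v) for v in fields[1:]]
    return stations


def mamj_totals(stations):
    """Sum March to June (months 3-6) per station and year, skipping missing values."""
    totals = {}
    for name, years in stations.items():
        totals[name] = {
            year: sum(v for v in months[2:6] if v != MISSING)
            for year, months in years.items()
        }
    return totals


sample = """\
120 6272100 KHARTOUM 15.60 32.55 382 1899 1989 0.0
1899 0.03 0.03 0.03 0.03 0.03 1.03 13.03 12.03 9999 6.03 0.03 0.03
1900 0.03 0.03 0.03 0.03 0.03 23.03 80.03 47.03 23.03 8.03 0.03 0.03
120 6272101 JEBEL AULIA 15.20 32.50 380 1920 1988 0.0
1920 0.03 0.03 0.03 0.00 0.03 6.90 20.00 108.80 47.30 1.00 0.01 0.03
""".splitlines()

for station, per_year in mamj_totals(parse_stations(sample)).items():
    for year, total in sorted(per_year.items()):
        print(station, year, round(total, 2))
```

Each station is reached by its name as a dictionary key, so no variable names need to be built from strings.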
