Getting excel files to run on python - python-3.x

I have the following code in a Jupyter notebook, using Python. When I run it I get a "FileNotFoundError", but all of the files are in a labeled folder.
file_path=os.path.dirname(os.path.abspath("__file__"))
df=pd.read_csv(file_path+ "\\score_NFL.csv",encoding="utf-8")
teams=pd.read_csv(file_path+"\\nfl_teams.csv",encoding='utf-8')
games_elo=pd.read_csv(file_path+"\\nfl_games3.csv",encoding="utf-8")
games_elo18=pd.read_csv(file_path + "\\nfl_games_2019_1.csv",encoding="utf-8")

You don't want to have quotes around __file__. It refers to a special object created when running a script or using an imported module (see the answers here for details). Your first line should be
file_path=os.path.dirname(os.path.abspath(__file__))
However, you state in your question that you're attempting to run this code in a Jupyter notebook. In an instance like this, similar to using the REPL on the command line, __file__ is not defined because you're not running the script from a file - it's interactive.
This method will work if you save your code in a .py file and run it from the directory containing your CSV files. At the same time, if you're running the code from the same directory, you don't need to go through all the hassle of creating a full absolute path to the CSVs, you can simply use
df = pd.read_csv("score_NFL.csv", encoding="utf-8")
for example.
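A minimal sketch of the two cases described above (script versus notebook), using pathlib. The helper names data_dir and csv_path are hypothetical and not part of pandas. The assumption is that you pass __file__ when running from a .py file and pass nothing in a notebook, where the current working directory is used instead.

```python
from pathlib import Path


def data_dir(script_file=None):
    """Return the folder the CSV files are expected to live in.

    script_file: pass __file__ (no quotes) when running from a .py file.
    Leave it as None in a notebook or REPL, where __file__ is not defined;
    the current working directory is used in that case.
    """
    if script_file is None:
        return Path.cwd()
    return Path(script_file).resolve().parent


def csv_path(name, script_file=None):
    """Build the full path to one CSV file (hypothetical helper)."""
    return data_dir(script_file) / name
```

In a notebook, something like pd.read_csv(csv_path("score_NFL.csv"), encoding="utf-8") should then work, provided the notebook was started in the folder that holds the CSV files.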

Related

How can I load data from an external drive into a Jupyter notebook on MacOs?

I have running code in my home directory in Jupyter notebook. I'd like to import some data from an external drive, as the data is too large to be stored on my local drive. I am unable to import this data into my notebook.
If you can browse to the data using Finder, you should be able to load the data into your Jupyter notebook.
Your code to open a file in your notebook probably looks something like this:
with open("data.txt","r") as dataFile:
You need to replace data.txt with the path to the file you want to open.
One way to get the path name is to navigate to the file in Finder and then drag the file into your Terminal. The path to the file will then be pasted into the command prompt of the Terminal.
The path may look something like this: /Volumes/Remove/Data/bigfile.txt
You can also navigate to the file using the cd command in the Terminal.
You may even be able to drag the file directly into your Jupyter notebook to get the path.
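As a rough sketch of the idea, the function below checks that the path exists before opening it, which gives a clearer message when the external drive is not mounted. The function name read_first_line is made up for illustration, and the example path is the one shown above.

```python
from pathlib import Path


def read_first_line(path):
    """Open a file by its full path and return its first line."""
    p = Path(path)
    if not p.exists():
        # On macOS, external drives appear under /Volumes only while mounted.
        raise FileNotFoundError(f"{p} not found; is the drive mounted?")
    with p.open("r") as data_file:
        return data_file.readline()


# Example call (assumes this file exists on your machine):
# read_first_line("/Volumes/Remove/Data/bigfile.txt")
```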
Hope this helps!

How do I run a .py script?

I just started learning Python last week to automate some stuff I do (thanks to automatetheboringstuff.com). Assume I know nothing about programming. The only thing I know is HTML and CSS.
I created a simple automation workflow already and I want to improve not the code (maybe in the future because it's not yet finished) but how I can maintain my setup/program on two laptops -- Both Mac OS running on High Sierra.
I have a .py file that contains my automated workflow. I don't know where to place it. It currently resides in my Dropbox so i can use it on laptop1 and laptop2.
I also created a virtualenv for each machine and did the requirements.txt thing as well (just to prep for the future). The directory is on both username/python/project_name.
I read in some posts that these files and other resources can exist anywhere whether inside each virtualenv or not. And that it's just a preference. I also read that the virtualenv itself isn't recommended to be placed inside apps like Dropbox (that's why i separated it on each laptop).
I switch between both laptops frequently. The environment which contains the packages doesn't really concern me that much when switching. It's the other files that is bothering me. For example, there's an image I need, this has to be available on both laptops so my solution to this is to have a Resources folder inside Dropbox as well. It currently looks like this:
Dropbox
    Projects
        Project 1 files (images, etc.)
        Project 2 files (images, etc.)
    Workflows (this would contain my completed .py files)
I read some stuff about the virtualenvwrapper, but haven't looked at it yet. Maybe in the future when i do have more projects to manage. Because right now, it's just this one.
Lastly, I noticed that every time i open up Terminal and activate my virtualenv, the file directory is in Users/username
How can i set it to default to Dropbox/Projects/project_name? I always have to set it using the chdir(). That way, when i do have multiple projects (and virtualenv) i don't have to worry about where the files load/ save.
Finally, how do I run the .py script? If i open the IDLE, open the .py file there, and use f5, it runs properly. But as far as I know, that doesn't look into the virtualenv i setup. Is that correct?
I tried right-clicking, then Open With > Python Launcher the .py file. and i'm getting an error saying there are no modules found. It seems it's not loading the right virtualenv. So there must be something wrong with the file i made.
Then I read about the #! you place at the beginning of the .py files but i don't understand it. Can someone explain that further? Is that why my file isn't loading properly?
Thanks for helping out!
You can run .py scripts from the command line using:
python test.py
That tells the terminal to run test.py in the Python interpreter and send the output to your terminal, just like when you run it in IDLE. If your .py script is not in your current directory and you don't want to change directories, you can access it using its absolute path:
python /Users/username/Dropbox/Workflows/test.py
As long as you have already activated your virtualenv, it should run your script using only the libraries you have added to your virtualenv. Also, once your virtualenv is activated, you can move around directories using "cd" and it will bring your virtualenv with you.
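If you are unsure whether a script is really running inside your virtualenv, a small check like the one below can help. This is only a sketch: the function name in_virtualenv is hypothetical, and it relies on sys.prefix differing from sys.base_prefix inside a virtual environment (older virtualenv releases set sys.real_prefix instead, so both are checked).

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv versions set sys.real_prefix; venv and newer
    # virtualenv versions make sys.prefix differ from sys.base_prefix.
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    # Shows which python binary is running the script and whether it
    # belongs to a virtual environment.
    print(sys.executable)
    print(in_virtualenv())
```

Running this with python from an activated virtualenv and again from IDLE or Python Launcher should show whether the two use different interpreters.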

Cannot write to file when using task manager for python script

I have created a python script that given an input file, will run NMap on arguments from the input file. It then writes to an output file in csv format. My script works fine and as intended when I run from IDLE, but when my script runs from the task manager, it never overwrites the excel/csv file I tell my script to write to. The path I provide in the file:
ipResults = r'C:\Users\________\Documents\Results.csv'
I've left out the username for security concerns.
I've set the script to run when I log on. When I log on, I see the output/results in a taskeng.exe window with a python symbol and rocketship. But when it finishes running, Results.csv does not get updated. As said previously, when running through IDLE, the script does overwrite Results.csv.
I changed the file mode from w to w+ to see if that was the problem, but no luck. I'm fine with the program overwriting my past results (in fact, that's what I want), but when my script is run through the task manager it does not overwrite the Results.csv file.
Simply checking the "Run with highest privileges" box when setting the task up in the task manager fixed my error; I am now able to write to my output file.
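For anyone debugging a similar problem, a sketch like the following can make a silent write failure visible. The function name write_results is hypothetical and is not the asker's code; it assumes the results are rows of values, and it returns the error text (including the working directory the task was started in) instead of letting the window close on an unseen exception.

```python
import csv
import os


def write_results(path, rows):
    """Overwrite the CSV at path with rows.

    Returns None on success, or a message describing why the write failed.
    """
    try:
        # Mode "w" truncates the file, so old results are replaced.
        with open(path, "w", newline="") as out_file:
            csv.writer(out_file).writerows(rows)
    except OSError as exc:
        # PermissionError and FileNotFoundError are both subclasses of OSError.
        return f"could not write {path} (cwd={os.getcwd()}): {exc}"
    return None
```

The returned message could be appended to a log file in a location the task is known to be able to write to.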

Do I need multiple run configurations - one per Python file - in Pycharm even though the only difference between them is the script?

I created a Python project in Pycharm which contains multiple Python files. As of just now, I need to create a run configuration for each Python file in my project, even though they're all the exact same - with the exception of the script.
This seems unnecessary and laborious and I would love to just use one run configuration for multiple Python files.
That said, I'm a novice Python programmer just getting started and so still unfamiliar with large parts of the language.
My Project Files:
My Run Configuration - Used for all Python files:
Some Research Carried Out
I've searched for a solution and explanation to this, but have been unable to find anything. Some of the places I've tried:
JetBrainsTV on youtube (https://www.youtube.com/watch?v=JLfd9LOdu_U)
JetBrains Website (https://www.jetbrains.com/help/pycharm/run-debug-configuration-python.html)
Stack Overflow
I hope there is sufficient detail here, if not I'd be happy to elaborate.
If those files are independent and you have nothing specific to them, then I see two simple ways of running them:
You don't have to manually create a run configuration for every file. You can just Right-Click on the file in the project tree and click "Run "
You can use the Terminal and run the files using the python interpreter as needed.
I was facing a similar situation when I started competitive programming. In my case I wanted to redirect my Test Cases from an input.txt file rather than manually typing the test cases for every run of my code. Using the above solution was not feasible, as I would need to manually change the Script Path and Redirect Input path in the Run Configuration window for every script I was running.
So what I wanted was, one run configuration, that would run all the scripts with Redirect Input path being set to input.txt.
To do that,
I created a main.py file with the following content:
import sys

if __name__ == '__main__':
    # The script to run is passed as the first command-line argument.
    fname = sys.argv[1]
    exec(open(fname).read())
This main.py file is going to run my other python scripts.
Created this run configuration for the main.py file.
Now, every time I needed to run any code, I ran this configuration with that code's window open. This executed main.py with the current file name passed as its argument, and the script then also took its input redirected from input.txt.
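As a possible variation on the main.py above (not what I originally used), the standard library's runpy module can stand in for exec(open(fname).read()). The sketch below swaps in runpy.run_path; the function name run_script is hypothetical.

```python
import runpy


def run_script(path):
    """Run another Python file as if it had been started directly.

    run_name="__main__" makes the target's `if __name__ == "__main__":`
    block execute. The target's global variables are returned as a dict.
    """
    return runpy.run_path(path, run_name="__main__")
```

In main.py this would be called as run_script(sys.argv[1]), after importing sys. Unlike exec, the target script runs in its own namespace and gets its own __file__, so code in it that relies on __file__ keeps working.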
Hope this helps you or anyone trying to run multiple python scripts with a single run configuration in PyCharm.

PyCharm project path different from interactive session path

When running an interactive session, PyCharm thinks of os.getcwd() as my project's directory. However, when I run my script from the command line, PyCharm thinks of os.getcwd() as the directory of the script.
Is there a good workaround for this? Here is what I tried and did not like:
going to Run/Edit Configurations and changing the working directory manually. I did not like this solution, because I will have to do it for every script that I run.
having one line in my code that "fixes" the path for the purposes of interactive sessions and commenting it out before running from command line. This works, but feels wrong.
Is there a way to do this or is it just the way it is supposed to be? Maybe I shouldn't be trying to run random scripts within my project?
Any insight would be greatly appreciated.
Clarification:
By "interactive session" I mean being able to run each line individually in a Python/IPython Console
By "running from command line" I mean creating a script my_script.py and running python path_to_myscript/my_script.py (I actually press the Run button at PyCharm, but I think it's the same).
Other facts that might prove worth mentioning:
I have created a PyCharm project. This contains (among other things) the package Graphs, which contains the module Graph and some .txt files. When I do something within my Graph module (e.g. read a graph from a file), I like to test that things worked as expected. I do this by running a selection of lines (interactively). To read a .txt file, I have to go (using os.path.join()) from the current working directory (the project directory, ...\project_name) to the module's directory ...\project_name\Graphs, where the file is located. However, when I run the whole script via the command line, the command reading the .txt file raises an Error, complaining that no file was found. By looking on the name of the file that was not found, I see that the full file name is something like this:
...\project_name\Graphs\Graphs\graph1.txt
It seems that this time the current working directory is ...\project_name\Graphs\, and my os.path.join() command actually spoils it.
I use various methods in my python scripts.
set the working directory as first step of your code using os.chdir(some_existing_path)
This means all your other paths should be referenced relative to this one, as you have hard-set the path. You just need to make sure it works from any location, and specifically in your IDE. Obviously, another os.chdir() would change the working directory, and os.getcwd() would then return the new working directory.
set the working directory to __file__ by using os.chdir(os.path.dirname(__file__))
This is actually what I use most, as it is quite reliable, and then I reference all further paths or file operations to this. Or you can simply refer to os.path.dirname(__file__) in your code without actually changing the working directory.
get the working directory using os.getcwd()
And reference all path and file operations to this, knowing it will change based on how the script is launched. Note: do NOT assume that this returns the location of your script, it returns the working directory of the shell !!
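A small sketch of the second method without changing the working directory: build every path from the script's own location, so the result does not depend on where the script was launched from. The helper names script_dir and resolve_input are hypothetical; script_file is meant to be __file__ when called from a real script.

```python
import os


def script_dir(script_file):
    """Directory that contains the given script file."""
    return os.path.dirname(os.path.abspath(script_file))


def resolve_input(filename, script_file):
    """Path to a data file stored next to the script."""
    return os.path.join(script_dir(script_file), filename)


# Inside Graph.py this might be used as:
# graph_path = resolve_input("graph1.txt", __file__)
```

Because the path is anchored to the script, a later os.chdir() does not affect it, provided script_file is an absolute path (or the path is resolved before the working directory changes).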
[EDIT based on new information]
By "interactive session" I mean being able to run each line
individually in a Python/IPython Console
When running interactively line by line in a Python console, __file__ is not defined; after all, you are not executing a file. Hence you cannot use os.path.dirname(__file__), and you will have to use something like os.chdir(some_known_existing_dir) to reference a path. As a programmer you need to be very aware of the working directory and any changes to it, and your code should reflect that.
By "running from command line" I mean creating a script my_script.py
and running python path_to_myscript/my_script.py (I actually press the
Run button at PyCharm, but I think it's the same).
This, both executing a .py from command line as well as running in your IDE, will populate the __file__, hence you can use os.path.dirname(__file__)
HTH
I am purposely adding another answer to this post, regarding the following:
Other facts that might prove worth mentioning:
I have created a PyCharm project. This contains (among other things)
the package Graphs, which contains the module Graph and some .txt
files. When I do something within my Graph module (e.g. read a graph
from a file), I like to test that things worked as expected. I do this
by running a selection of lines (interactively). To read a .txt file,
I have to go (using os.path.join()) from the current working directory
(the project directory, ...\project_name) to the module's directory
...\project_name\Graphs, where the file is located. However, when I
run the whole script via the command line, the command reading the
.txt file raises an Error, complaining that no file was found. By
looking on the name of the file that was not found, I see that the
full file name is something like this:
...\project_name\Graphs\Graphs\graph1.txt It seems that this time
the current working directory is ...\project_name\Graphs\, and my
os.path.join() command actually spoils it.
I strongly believe that if a python script takes input from any file, that the author of the script needs to cater for this in the script.
What I mean is you as the author need to make sure you know the following regardless of how your script is executed:
What is the working directory
What is the script directory
You have no control over these two when you hand off your script to others, or run it on other people's machines. The working directory depends on how the script is launched. It seems that you run on Windows, so here is an example:
C:\> c:\python\python your_script.py
The working directory is now C:\ if your_script.py is in C:\
C:\some_dir\another_dir\> c:\python\python.exe c:\your_script_dir\your_script.py
The working directory is now C:\some_dir\another_dir
And the above example may even give different results if the SYSTEM PATH variable is set to the path of the location of your_script.py
You need to ensure that your script works even if the user(s) of your script are placing this in various locations on their machines. Some people (and I don't know why) tend to put everything on the Desktop. You need to ensure your script can cope with this, including any spaces in the path name.
Furthermore, if your script is taking input from a file, then you as the author need to ensure that you can cope with changes in working directory, and changes of script directory. There are a few things you may consider:
Have your script input from a known (static) directory, something like C:\python_input\
Have your script input from a known (configurable) directory, using ConfigParser; there are many posts about it here on Stack Overflow
Have your script input from a known directory related to the location of the script (using os.path.dirname(__file__))
any other method you may employ to ensure your script can get to the input
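To illustrate the configurable option combined with the script-relative one, here is a minimal sketch using the standard library's configparser. The function name input_dir, the section name "paths" and the option name "input_dir" are all made up for this example; if the option is missing, the script's own directory is used as the fallback.

```python
import configparser
import os


def input_dir(config_text, script_file):
    """Return the directory to read input files from.

    config_text: contents of an INI-style config file (may be empty).
    script_file: normally __file__; its directory is the fallback.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    default_dir = os.path.dirname(os.path.abspath(script_file))
    # fallback is returned when the section or the option does not exist.
    return parser.get("paths", "input_dir", fallback=default_dir)
```

With a config containing a [paths] section and a line such as input_dir = C:\python_input, that directory is returned; with no config, the function falls back to the folder the script lives in.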
Ultimately this is all in your control, and you need to code to ensure it is working.
HTH,
Edwin.

Resources