Django model generates hundreds of files in a saving loop

I'm using Django 3.8.2 on Ubuntu 18.04. I generate a PDF file when a Django model is saved, but the files are generated endlessly in a loop.
I have this Django model:
class Fattura(models.Model):
    ordine = models.ForeignKey(Ordine, on_delete=models.CASCADE, related_name="fatture", null=False)
    pdf = models.FileField(upload_to='archivio/fatture/%Y/%m')
The pdf field is generated when an instance is saved, based on the information in the related "ordine" field, which is a ForeignKey to this other model:
class Ordine(models.Model):
    utente = models.ForeignKey(User, on_delete=models.CASCADE, related_name="ordini", null=False)
    data = models.DateField(auto_now_add=True)
    abbonamento = models.ForeignKey(Abbonamento, on_delete=models.PROTECT, null=False)
    importo = models.DecimalField(max_digits=5, decimal_places=2, null=False)
I put all the instructions for generating the PDF inside the save() method of my Fattura model. I'm using the library recommended by Django's documentation: reportlab. Here is the custom save method:
def save(self, *args, **kwargs):
    self.count_save += 1  # I defined this attribute which I increment to understand what is actually looping
    print("COUNT SAVE: " + str(self.count_save))  # it always grows, that's the save method being re-called
    if self.pdf._file == None:
        try:
            buffer = io.BytesIO()
            p = canvas.Canvas(buffer)
            p.drawString(100, 100, str(self.ordine))
            p.showPage()
            p.save()
            buffer.seek(0)
            utente = self.ordine.utente
            num_questa_fattura = utente.ordini.count()
            nome_file = "{}_{}-{}.pdf".format(
                self.ordine.utente.first_name.lower(),
                self.ordine.utente.last_name.lower(),
                num_questa_fattura)
            percorso = '{}/upload/archivio/fatture/{}/{}/{}'.format(
                BASE_DIR,  # from settings.py
                self.ordine.data.year,
                self.ordine.data.month,
                nome_file)
            file_temporaneo = NamedTemporaryFile(delete=True)
            file_temporaneo.write(buffer.getbuffer())
            file_temporaneo.flush()
            temp_file = File(file_temporaneo, name=nome_file)
            print(nome_file)
            print(file_temporaneo)
            print("--------------------- SAVE")
            self.pdf.save(nome_file, file_temporaneo, save=True)  # saving the field (EDIT: this was the problem, setting save to True resaves the whole model object, and it was creating an infinite loop)
            file_temporaneo.close()
        except:
            raise ValidationError("Invoice could not be saved")
    super().save(*args, **kwargs)  # saving the object to the database
When I save a Fattura object from the Django admin panel, hundreds of PDF files are generated in a loop inside the folder I declared, and the database save never completes. No Fattura object is available afterward, so nothing is actually saved to the database, and precisely 499 files are generated each time.
I'm not sure what is causing this loop. Is it the pdf field's save method in the try statement, or the final super().save(*args, **kwargs)? I can't remove the latter, though; otherwise nothing would be saved when a PDF file is already associated with an instance and I'm only updating the other field, ordine.
Here's an excerpt from my logs:
[Tue Nov 02 15:25:06.974768 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] COUNT SAVE: 262
[Tue Nov 02 15:25:07.010584 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] maria_rosselli-1.pdf
[Tue Nov 02 15:25:07.015778 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] <tempfile._TemporaryFileWrapper object at 0x7f7b07ac0390>
[Tue Nov 02 15:25:07.020944 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] --------------------- SAVE
[Tue Nov 02 15:25:07.124451 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] COUNT SAVE: 263
[Tue Nov 02 15:25:07.164934 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] maria_rosselli-1.pdf
[Tue Nov 02 15:25:07.170144 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] <tempfile._TemporaryFileWrapper object at 0x7f7b07ac0e48>
[Tue Nov 02 15:25:07.175250 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] --------------------- SAVE
[Tue Nov 02 15:25:07.253426 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] COUNT SAVE: 264
[Tue Nov 02 15:25:07.286589 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] maria_rosselli-1.pdf
[Tue Nov 02 15:25:07.291734 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] <tempfile._TemporaryFileWrapper object at 0x7f7b07ac0e10>
[Tue Nov 02 15:25:07.296894 2021] [wsgi:error] [pid 10050] [remote 95.x.x.x:36048] --------------------- SAVE

You need to set save to False so that saving the pdf field does not call the model instance's save method again:
self.pdf.save(nome_file, file_temporaneo, save=False)
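The mechanism can be shown without Django. The sketch below uses two hypothetical stand-in classes (FakeFieldFile and FakeFattura are not Django classes) that only mimic the documented behaviour that a file field's save(name, content, save=True) saves the owning model instance afterwards. The MAX_DEMO_SAVES cap is an assumption added so the demonstration terminates.

```python
class FakeFieldFile:
    """Hypothetical stand-in for a Django file field value."""

    def __init__(self, instance):
        self.instance = instance
        self.name = None

    def save(self, name, content, save=True):
        self.name = name  # the real field would write the file to storage here
        if save:
            # With save=True the owning instance is saved again.
            self.instance.save()


class FakeFattura:
    """Hypothetical stand-in for the Fattura model."""

    MAX_DEMO_SAVES = 5  # safety cap so the demo stops instead of recursing forever

    def __init__(self, field_save_flag):
        self.field_save_flag = field_save_flag
        self.count_save = 0
        self.pdf = FakeFieldFile(self)

    def save(self):
        self.count_save += 1
        if self.count_save >= self.MAX_DEMO_SAVES:
            return
        # As in the question, the file is regenerated on every call to save().
        self.pdf.save("invoice.pdf", b"dummy content", save=self.field_save_flag)


looping = FakeFattura(field_save_flag=True)
looping.save()
fixed = FakeFattura(field_save_flag=False)
fixed.save()
print(looping.count_save, fixed.count_save)
```

With the flag set to True the counter climbs until the cap is hit; with False the model's save() runs once and the final super().save() in the real model is what writes the row.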

Related

AttributeError: 'super' object has no attribute 'S'

Still learning and new to code. I am working on (really playing around and learning with) a self-motivated project to log stock HALTS cleanly. I am able to get the symbols I want, and I can even trigger the status message when a HALT occurs by streaming.
However, when a status message occurs, I cannot parse the object correctly and cannot figure out why.
To start, I scan for symbols and make a list. I then use a for loop to subscribe to status messages for those symbols:
for symbol in symbols:
    stream.subscribe_statuses(on_status, symbol)
My handler, 'on_status', is where I think I am having the problem.
async def on_status(status):
    symbol = status.symbol
    status_message = status.status_message
    logging.info("==============")
    logging.info(f"{symbol} | {status_message}")
    logging.info("==============")
When a status message is streamed I receive an error:
Sep 27 10:27:12 error during websocket communication: 'super' object has no attribute 'S'
Sep 27 10:27:12 Traceback (most recent call last):
Sep 27 10:27:12 File "/app/.heroku/python/lib/python3.10/site-packages/alpaca_trade_api/stream.py", line 254, in _run_forever
Sep 27 10:27:12 await self._consume()
Sep 27 10:27:12 File "/app/.heroku/python/lib/python3.10/site-packages/alpaca_trade_api/stream.py", line 130, in _consume
Sep 27 10:27:12 await self._dispatch(msg)
Sep 27 10:27:12 File "/app/.heroku/python/lib/python3.10/site-packages/alpaca_trade_api/stream.py", line 381, in _dispatch
Sep 27 10:27:12 await handler(self._cast(msg_type, msg))
Sep 27 10:27:12 File "/app/tading_halts.py", line 1184, in on_status
Sep 27 10:27:12 logging.info(f"{symbol} | {status_message}")
Sep 27 10:27:12 File "/app/.heroku/python/lib/python3.10/site-packages/alpaca_trade_api/entity_v2.py", line 135, in __getattr__
Sep 27 10:27:12 return super().__getattr__(self._reversed_mapping[key])
Sep 27 10:27:12 File "/app/.heroku/python/lib/python3.10/site-packages/alpaca_trade_api/entity.py", line 149, in __getattr__
Sep 27 10:27:12 return getattr(super(), key)
Sep 27 10:27:12 AttributeError: 'super' object has no attribute 'S'
Now, if I just use {status}, I do get the StatusV2 object, but I cannot figure out how to extract the symbol and the message from it. The object comes back as this:
Sep 27 10:00:09 | StatusV2({ 'reason_code': '',
Sep 27 10:00:09 'reason_message': '',
Sep 27 10:00:09 'status_code': 'T',
Sep 27 10:00:09 'status_message': 'Trading Resumption',
Sep 27 10:00:09 'symbol': 'ATXI',
Sep 27 10:00:09 'tape': 'C',
Sep 27 10:00:09 'timestamp': 1664287209393738868})
Any help would be greatly appreciated as I continue to learn.

Spark/Kafka error after updating Spark 2.4 to Spark 3.1

I tried to upgrade my JupyterHub setup (PySpark with Python 3.7 on Debian Linux, with Kafka) from Spark 2.4.1 to Spark 3.1.2. I therefore also updated Kafka from 2.4.1 to 2.8, but this does not seem to be the problem. I checked the dependencies from https://spark.apache.org/docs/latest/ and they seem fine so far.
For Spark 2.4.1 I used these additional jars in the Spark directory:
slf4j-api-1.7.26.jar
unused-1.0.0.jar
lz4-java-1.6.0.jar
kafka-clients-2.3.0.jar
spark-streaming-kafka-0-10_2.11-2.4.3.jar
spark-sql-kafka-0-10_2.11-2.4.3.jar
For Spark 3.1.2 I updated these jars and added some more; the other files, such as unused, already existed:
spark-sql-kafka-0-10_2.12-3.1.2.jar
spark-streaming-kafka-0-10_2.12-3.1.2.jar
spark-streaming-kafka-0-10-assembly_2.12-3.1.2.jar
spark-token-provider-kafka-0-10_2.12-3.1.2.jar
kafka-clients-2.8.0.jar
I stripped my PySpark code down to the following, which works with Spark 2.4.1 but not with Spark 3.1.2:
from pyspark.sql import SparkSession
import pyspark.sql.functions as F
import pyspark.sql.types as T
from pyspark.sql.utils import AnalysisException
import datetime

# configuration of target db
db_target_url = "jdbc:mysql://localhost/test"
db_target_properties = {"user": "john", "password": "doe"}

# create spark session
spark = SparkSession.builder.appName("live1").getOrCreate()
spark.conf.set('spark.sql.caseSensitive', True)

# create schema for the json iba data
schema_tww_vs = T.StructType([T.StructField("[22:8]", T.DoubleType()),
                              T.StructField("[1:3]", T.DoubleType()),
                              T.StructField("Timestamp", T.StringType())])

# create dataframe representing the stream and take the json data into a usable df structure
d = spark.readStream \
    .format("kafka").option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "test_so") \
    .load() \
    .selectExpr("timestamp", "cast (value as string) as json") \
    .select("timestamp", F.from_json("json", schema_tww_vs).alias("struct")) \
    .selectExpr("timestamp", "struct.*")

# add timestamp of this spark processing
d = d.withColumn("time_spark", F.current_timestamp())
d1 = d.withColumnRenamed('[1:3]', 'signal1') \
    .withColumnRenamed('[22:8]', 'ident_orig') \
    .withColumnRenamed('timestamp', 'time_kafka') \
    .withColumnRenamed('Timestamp', 'time_source')
d1 = d1.withColumn("ident", F.round(d1["ident_orig"]).cast('integer'))
d4 = d1.where("signal1 > 3000")
d4a = d4.withWatermark("time_kafka", "1 second") \
    .groupby('ident', F.window('time_kafka', "5 second")) \
    .agg(
        F.count("*").alias("count"),
        F.min("time_kafka").alias("time_start"),
        F.round(F.avg("signal1"), 1).alias('signal1_avg'))

# Remove the column "window" since this struct (with start and stop time) cannot be written to the db
d4a = d4a.drop('window')
d8a = d4a.select('time_start', 'ident', 'count', 'signal1_avg')

# write the dataframe into the database using the streaming mode
def write_into_sink(df, epoch_id):
    df.write.jdbc(table="test_so", mode="append", url=db_target_url, properties=db_target_properties)

query_write_sink = d8a.writeStream \
    .foreachBatch(write_into_sink) \
    .trigger(processingTime="1 seconds") \
    .start()
Some of the errors are:
java.lang.NoClassDefFoundError: org/apache/commons/pool2/impl/GenericKeyedObjectPoolConfig
Jul 22 15:41:22 run [847]: #011at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer$.<init>(KafkaDataConsumer.scala:623)
Jul 22 15:41:22 run [847]: #011at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer$.<clinit>(KafkaDataConsumer.scala)
…
jupyterhub-start.sh[847]: 21/07/22 15:41:22 ERROR TaskSetManager: Task 0 in stage 2.0 failed 1 times; aborting job
Jul 22 15:41:22 run [847]: 21/07/22 15:41:22 ERROR MicroBatchExecution: Query [id = 5d2a70aa-1463-48f3-a4a6-995ceef22891, runId = d1f856b5-eb0c-4635-b78a-d55e7ce81f2b] terminated with error
Jul 22 15:41:22 run [847]: py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
Jul 22 15:41:22 run [847]: File "/opt/anaconda/envs/env1/lib/python3.7/site-packages/py4j/java_gateway.py", line 2451, in _call_proxy
Jul 22 15:41:22 run [847]: return_value = getattr(self.pool[obj_id], method)(*params)
Jul 22 15:41:22 run [847]: File "/opt/spark/python/pyspark/sql/utils.py", line 196, in call
Jul 22 15:41:22 run [847]: raise e
Jul 22 15:41:22 run [847]: File "/opt/spark/python/pyspark/sql/utils.py", line 193, in call
Jul 22 15:41:22 run [847]: self.func(DataFrame(jdf, self.sql_ctx), batch_id)
Jul 22 15:41:22 run [847]: File "<ipython-input-10-d40564c31f71>", line 3, in write_into_sink
Jul 22 15:41:22 run [847]: df.write.jdbc(table="test_so", mode="append", url=db_target_url, properties=db_target_properties)
Jul 22 15:41:22 run [847]: File "/opt/spark/python/pyspark/sql/readwriter.py", line 1445, in jdbc
Jul 22 15:41:22 run [847]: self.mode(mode)._jwrite.jdbc(url, table, jprop)
Jul 22 15:41:22 run [847]: File "/opt/anaconda/envs/env1/lib/python3.7/site-packages/py4j/java_gateway.py", line 1310, in __call__
Jul 22 15:41:22 run [847]: answer, self.gateway_client, self.target_id, self.name)
Jul 22 15:41:22 run [847]: File "/opt/spark/python/pyspark/sql/utils.py", line 111, in deco
Jul 22 15:41:22 run [847]: return f(*a, **kw)
Jul 22 15:41:22 run [847]: File "/opt/anaconda/envs/env1/lib/python3.7/site-packages/py4j/protocol.py", line 328, in get_return_value
Jul 22 15:41:22 run [847]: format(target_id, ".", name), value)
Jul 22 15:41:22 run [847]: py4j.protocol.Py4JJavaError: An error occurred while calling o101.jdbc.
Jul 22 15:41:22 run [847]: : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 200) (master executor driver): java.lang.NoClassDefFoundError: org/apache/commons/pool2/impl/GenericKeyedObjectPoolConfig
Do you have any idea what causes this error?

As devesh said, one jar file was missing:
commons-pool2-2.8.0.jar, which can be downloaded from https://mvnrepository.com/artifact/org.apache.commons/commons-pool2/2.8.0
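A quick way to spot this kind of gap is to check the jar directory against a list of expected names before starting the stream. The following is only a sketch: the missing_jars helper is hypothetical, the prefix list is assembled from the jars listed in the question plus the commons-pool2 jar named above, and it is not claimed to be the complete dependency set of the Kafka source.

```python
from pathlib import Path

# Assumed list of jar name prefixes, taken from this thread only.
REQUIRED_JAR_PREFIXES = [
    "spark-sql-kafka-0-10_2.12",
    "spark-token-provider-kafka-0-10_2.12",
    "kafka-clients",
    "commons-pool2",
]


def missing_jars(jar_dir, required_prefixes=REQUIRED_JAR_PREFIXES):
    """Return the prefixes for which no matching .jar file exists in jar_dir."""
    present = [p.name for p in Path(jar_dir).glob("*.jar")]
    return [prefix for prefix in required_prefixes
            if not any(name.startswith(prefix) for name in present)]
```

Calling missing_jars on the directory that holds the Spark jars (its location depends on the installation) returns an empty list when every expected jar is present.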

Scraping works fine until appended to list [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 3 years ago.
I am a beginner trying to scrape bitcoin price history. Everything works fine until I try to append the rows to a list: nothing ends up being appended.
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import datetime

url = 'https://coinmarketcap.com/currencies/bitcoin/historical-data/?start=20130428&end=20190821'
page = requests.get(url).content
soup = BeautifulSoup(page, 'html.parser')
priceDiv = soup.find('div', attrs={'class': 'table-responsive'})
rows = priceDiv.find_all('tr')
data = []
i = 0
for row in rows:
    temp = []
    tds = row.findChildren()
    for td in tds:
        temp.append(td.text)
    if i > 0:
        temp[0] = temp[0].replace(',', '')
        temp[6] = temp[6].replace(',', '')
        if temp[5] == '-':
            temp[5] = 0
        else:
            temp[5] = temp[5].replace(',', '')
        data.append({'date': datetime.strptime(temp[0], '%b %d %Y'),
                     'open': float(temp[1]),
                     'high': float(temp[2]),
                     'low': float(temp[3]),
                     'close': float(temp[4]),
                     'volume': float(temp[5]),
                     'market_cap': float(temp[6])})
        i += 1
df = pd.DataFrame(data)
If I try to print df or data it is just empty.

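As noted above, you need to increment i outside the check for i > 0. As written, i starts at 0 and is only incremented inside the if i > 0 block, so the block never runs and nothing is appended.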
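A minimal sketch of the corrected loop, using enumerate so the counter cannot get stuck. To keep it runnable without a network call, it works on plain lists of cell strings instead of BeautifulSoup rows; parse_rows and sample_rows are hypothetical names, and the sample values are made up for illustration.

```python
from datetime import datetime


def parse_rows(rows):
    """Convert rows (lists of cell strings) into dicts, skipping the header row."""
    data = []
    for i, cells in enumerate(rows):
        if i == 0:
            continue  # header row
        cells = [c.replace(',', '') for c in cells]
        volume = 0.0 if cells[5] == '-' else float(cells[5])
        data.append({'date': datetime.strptime(cells[0], '%b %d %Y'),
                     'open': float(cells[1]),
                     'high': float(cells[2]),
                     'low': float(cells[3]),
                     'close': float(cells[4]),
                     'volume': volume,
                     'market_cap': float(cells[6])})
    return data


# Made-up sample rows in the same column order as the scraped table.
sample_rows = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Market Cap'],
    ['Aug 20, 2019', '100.00', '110.00', '90.00', '105.00', '1,000', '2,000'],
    ['Apr 28, 2013', '10.00', '11.00', '9.00', '10.50', '-', '500'],
]
parsed = parse_rows(sample_rows)
print(len(parsed))
```

In the original script the same effect is obtained by moving i += 1 one indentation level out, so it runs for every row.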
Secondly, have you considered using pandas' .read_html()? It will do the hard work for you.
Code:
import pandas as pd
url = 'https://coinmarketcap.com/currencies/bitcoin/historical-data/?start=20130428&end=20190821'
dfs = pd.read_html(url)
df = dfs[0]
Output:
print (df)
Date Open* ... Volume Market Cap
0 Aug 20, 2019 10916.35 ... 15053082175 192530283565
1 Aug 19, 2019 10350.28 ... 16038264603 195243306008
2 Aug 18, 2019 10233.01 ... 12999813869 185022920955
3 Aug 17, 2019 10358.72 ... 13778035685 182966857173
4 Aug 16, 2019 10319.42 ... 20228207096 185500055339
5 Aug 15, 2019 10038.42 ... 22899115082 184357666577
6 Aug 14, 2019 10889.49 ... 19990838300 179692803424
7 Aug 13, 2019 11385.05 ... 16681503537 194762696644
8 Aug 12, 2019 11528.19 ... 13647198229 203441494985
9 Aug 11, 2019 11349.74 ... 15774371518 205941632235
10 Aug 10, 2019 11861.56 ... 18125355447 202890020455
11 Aug 09, 2019 11953.47 ... 18339989960 211961319133
12 Aug 08, 2019 11954.04 ... 19481591730 213788089212
13 Aug 07, 2019 11476.19 ... 22194988641 213330426789
14 Aug 06, 2019 11811.55 ... 23635107660 205023347814
15 Aug 05, 2019 10960.74 ... 23875988832 210848822060
16 Aug 04, 2019 10821.63 ... 16530894787 195907875403
17 Aug 03, 2019 10519.28 ... 15352685061 193233960601
18 Aug 02, 2019 10402.04 ... 17489094082 187791090996
19 Aug 01, 2019 10077.44 ... 17165337858 185653203391
20 Jul 31, 2019 9604.05 ... 16631520648 180028959603
21 Jul 30, 2019 9522.33 ... 13829811132 171472452506
22 Jul 29, 2019 9548.18 ... 13791445323 169880343827
23 Jul 28, 2019 9491.63 ... 13738687093 170461958074
24 Jul 27, 2019 9871.16 ... 16817809536 169099540423
25 Jul 26, 2019 9913.13 ... 14495714483 176085968354
26 Jul 25, 2019 9809.10 ... 15821952090 176806451137
27 Jul 24, 2019 9887.73 ... 17398734322 175005760794
28 Jul 23, 2019 10346.75 ... 17851916995 176572890702
29 Jul 22, 2019 10596.95 ... 16334414913 184443440748
... ... ... ... ...
2276 May 27, 2013 133.50 ... - 1454029510
2277 May 26, 2013 131.99 ... - 1495293015
2278 May 25, 2013 133.10 ... - 1477958233
2279 May 24, 2013 126.30 ... - 1491070770
2280 May 23, 2013 123.80 ... - 1417769833
2281 May 22, 2013 122.89 ... - 1385778993
2282 May 21, 2013 122.02 ... - 1374013440
2283 May 20, 2013 122.50 ... - 1363709900
2284 May 19, 2013 123.21 ... - 1363204703
2285 May 18, 2013 123.50 ... - 1379574546
2286 May 17, 2013 118.21 ... - 1373723882
2287 May 16, 2013 114.22 ... - 1325726787
2288 May 15, 2013 111.40 ... - 1274623813
2289 May 14, 2013 117.98 ... - 1243874488
2290 May 13, 2013 114.82 ... - 1315710011
2291 May 12, 2013 115.64 ... - 1281982625
2292 May 11, 2013 117.70 ... - 1284207489
2293 May 10, 2013 112.80 ... - 1305479080
2294 May 09, 2013 113.20 ... - 1254535382
2295 May 08, 2013 109.60 ... - 1264049202
2296 May 07, 2013 112.25 ... - 1240593600
2297 May 06, 2013 115.98 ... - 1249023060
2298 May 05, 2013 112.90 ... - 1288693176
2299 May 04, 2013 98.10 ... - 1250316563
2300 May 03, 2013 106.25 ... - 1085995169
2301 May 02, 2013 116.38 ... - 1168517495
2302 May 01, 2013 139.00 ... - 1298954594
2303 Apr 30, 2013 144.00 ... - 1542813125
2304 Apr 29, 2013 134.44 ... - 1603768865
2305 Apr 28, 2013 135.30 ... - 1488566728
[2306 rows x 7 columns]

Python logging call crashes at usesTime

Once every 50-100 calls to the logger, the program crashes with the following trace:
Mar 20 07:10:14 service.bash[7693]: Fatal Python error: Cannot recover from stack overflow.
Mar 20 07:10:14 service.bash[7693]: Current thread 0x76fa3010 (most recent call first):
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 381 in usesTime
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 537 in usesTime
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 569 in format
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 831 in format
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 981 in emit
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 856 in handle
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 1488 in callHandlers
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 1426 in handle
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 1416 in _log
Mar 20 07:10:14 service.bash[7693]: File "/usr/lib/python3.5/logging/__init__.py", line 1280 in info
Mar 20 07:10:14 service.bash[7693]: File "*****************_app/base.py", line 63 in log_this
Any idea what could be causing this crash?
I don't see similar logging calls elsewhere in the program crashing it.
Here is the chain of calls made to the logger:
self.info("cs={} miso={} mosi{} clk{}".format( self.csPin, self.misoPin, self.mosiPin, self.clkPin))
|
self.log_this("info", msg)
|
self.log.info(msg)
The logger is set up in the base class initialization routine in the following way:
# Global logger is declared as a class attribute
cls.log = logging.getLogger(cls.args["app"])
c_handler = logging.StreamHandler()
f_handler = logging.handlers.RotatingFileHandler(
    cls.args["--log"],
    maxBytes=(10**6)*int(cls.args["--logsize"]),  # CLI is in MB
    backupCount=1)
# Create handlers
if cls.args["--debug"]:
    cls.log.setLevel(logging.DEBUG)
    c_handler.setLevel(logging.DEBUG)
    f_handler.setLevel(logging.DEBUG)
else:
    cls.log.setLevel(logging.INFO)
    c_handler.setLevel(logging.INFO)
    f_handler.setLevel(logging.INFO)
# Create formatters and add it to handlers
c_format = logging.Formatter('%(name)s: %(levelname)s: %(message)s')
f_format = logging.Formatter('%(asctime)s: %(name)s: %(levelname)s: %(message)s')
c_handler.setFormatter(c_format)
f_handler.setFormatter(f_format)
# Add handlers to the logger
cls.log.addHandler(c_handler)
cls.log.addHandler(f_handler)
cls.log.info("Logger initialized")
Thank you.

Are you using Windows and switching to a UTF-8 locale within Python? There is a Python bug in this specific scenario (similar issue: https://bugs.python.org/issue36792). Here's a minimal code sample that reproduces the bug on my machine:
import locale
import logging
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s'))
logging.getLogger().setLevel(logging.DEBUG)
logging.getLogger().addHandler(handler)
logging.info(1)
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
logging.info(2)
Output:
2019-10-25 21:20:15,657
Process finished with exit code -1073740940 (0xC0000374)
If you are unsure if this might be related to your problem, try adding the following line in front of your call to the logging module and see if it solves the problem:
locale.setlocale(locale.LC_ALL, 'en_US')

crontab Failed to import the site module

I want crontab to run the python script for a public welfare project.
I can successfully run the script in Pycharm.
When I run it with crontab, there is an error.
Environment: Mac OS, python3.5
After I type 'crontab -e', it shows that:
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/Users/yy/anaconda/bin/python3:/Users/yy/anaconda/bin/
32 14 * * * PATH=$PATH:/Users/yy/anaconda/bin/ cd /Users/yy/PycharmProjects/selenium_test/ && /Users/yy/anaconda/bin/python3 /Users/yy/PycharmProjects/selenium_test/selenium_test.py >> /Users/yy/PycharmProjects/selenium_test/log.txt
I got an error as follows in the /var/mail/username:
From yy@YY.local Thu Jun 8 14:32:00 2017
Return-Path: <yy@YY.local>
X-Original-To: yy
Delivered-To: yy@YY.local
Received: by YY.local (Postfix, from userid 501)
id A7F1F38FFFCC; Thu, 8 Jun 2017 14:32:00 -0500 (CDT)
From: yy@YY.local (Cron Daemon)
To: yy@YY.local
Subject: Cron <yy@YY> PATH=$PATH:/Users/yy/anaconda/bin/ cd /Users/yy/PycharmProjects/selenium_test/ && /Users/yy/anaconda/bin/python3 /Users/yy/PycharmProjects/selenium_test/selenium_test.py >> /Users/yy/PycharmProjects/selenium_test/log.txt
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/Users/yy/anaconda/bin/python3:/Users/yy/anaconda/bin/>
X-Cron-Env: <LOGNAME=yy>
X-Cron-Env: <USER=yy>
X-Cron-Env: <HOME=/Users/yy>
Message-Id: <20170608193200.A7F1F38FFFCC@YY.local>
Date: Thu, 8 Jun 2017 14:32:00 -0500 (CDT)
Failed to import the site module
Traceback (most recent call last):
File "/Users/yy/anaconda/lib/python3.5/site.py", line 567, in <module>
main()
File "/Users/yy/anaconda/lib/python3.5/site.py", line 550, in main
known_paths = addsitepackages(known_paths)
File "/Users/yy/anaconda/lib/python3.5/site.py", line 327, in addsitepackages
addsitedir(sitedir, known_paths)
File "/Users/yy/anaconda/lib/python3.5/site.py", line 206, in addsitedir
addpackage(sitedir, name, known_paths)
File "/Users/yy/anaconda/lib/python3.5/site.py", line 162, in addpackage
for n, line in enumerate(f):
File "/Users/yy/anaconda/lib/python3.5/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 127: ordinal not in range(128)
I spent two hours on this error, but none of the solutions I found worked.
Please help.
Thanks!
I use python3.5, so the default encoding is utf-8. The UnicodeDecodeError is strange...

The problem is about encoding.
The default encoding of Python 3.5 is utf-8.
However, some of the packages I installed are encoded with Unicode.
I solved the problem by modifying the site.py file at /Users/user_name/anaconda/lib/python3.5/site.py.
Line 158:
f = open(fullname, "r") -> f = open(fullname, "rb")
Line 163:
if line.startswith("#"): -> if line.startswith(b"#"):
Line 166:
if line.startswith(("import ", "import\t")): -> if line.startswith((b"import ", b"import\t")):
Line 170:
dir, dircase = makepath(sitedir, line) -> dir, dircase = makepath(sitedir, str(line))
I don't think it's a good idea to modify "site.py" in anaconda, but this fixes the problem.
Hope it helps.
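A less invasive first step is to find which file triggers the error. The traceback shows the failure inside addpackage, which reads the .pth files in site-packages, so the sketch below scans such files for non-ASCII bytes. The helper name find_non_ascii_pth is hypothetical, and the directory to pass in is an assumption to adapt to your own installation.

```python
from pathlib import Path


def find_non_ascii_pth(site_dir):
    """Return (file name, byte offset) for each .pth file containing a non-ASCII byte."""
    hits = []
    for pth in sorted(Path(site_dir).glob('*.pth')):
        raw = pth.read_bytes()
        for offset, byte in enumerate(raw):
            if byte > 127:
                hits.append((pth.name, offset))  # report only the first offending byte
                break
    return hits
```

For the installation in the question the call would presumably be find_non_ascii_pth('/Users/yy/anaconda/lib/python3.5/site-packages'). Cron usually runs jobs with a minimal environment and no UTF-8 locale, in which case Python 3.5 falls back to ASCII when it opens text files; setting a UTF-8 locale variable such as LANG in the crontab is therefore another option worth trying before editing site.py.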
