Hi, I just installed Sphinx on my CentOS VPS, but for some reason whenever I search it gives me no results. I'm searching from the command line over SSH. Here is the command:
search --index sphinx_index_cc_post -a Introducing The New Solar Train Tunnel
This is the output of the command:
Sphinx 2.0.5-release (r3308)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/usr/local/etc/sphinx.conf'...
index 'sphinx_index_cc_post': query 'Introducing The New Solar Train Tunnel ': returned 0 matches of 0 total in 0.000 sec
words:
1. 'introducing': 0 documents, 0 hits
2. 'the': 0 documents, 0 hits
3. 'new': 0 documents, 0 hits
4. 'solar': 0 documents, 0 hits
5. 'train': 0 documents, 0 hits
6. 'tunnel': 0 documents, 0 hits
This is the source and index definition in my config file:
source sphinx_index_cc_post
{
    type            = mysql
    sql_host        = localhost
    sql_user        = user
    sql_pass        = password
    sql_db          = database
    sql_port        = 3306
    sql_query_range = SELECT MIN(postid), MAX(postid) FROM cc_post
    sql_range_step  = 1000
    sql_query       = SELECT postedby, category, totalvotes, trendvalue, featured, isactive, postingdate \
        FROM cc_post \
        WHERE postid BETWEEN $start AND $end
}

index sphinx_index_cc_post
{
    source       = sphinx_index_cc_post
    path         = /usr/local/sphinx/data/sphinx_index_cc_post
    charset_type = utf-8
    min_word_len = 2
}
The index itself seems to build fine: when I rotate the index, the documents are collected successfully. Here is the output of my indexer:
[root@server1 data]# indexer --rotate sphinx_index_cc_post
Sphinx 2.0.5-release (r3308)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/usr/local/etc/sphinx.conf'...
indexing index 'sphinx_index_cc_post'...
WARNING: Attribute count is 0: switching to none docinfo
WARNING: source sphinx_index_cc_post: skipped 1 document(s) with zero/NULL ids
collected 2551 docs, 0.1 MB
sorted 0.0 Mhits, 100.0% done
total 2551 docs, 61900 bytes
total 0.041 sec, 1474933 bytes/sec, 60784.40 docs/sec
total 2 reads, 0.000 sec, 1.3 kb/call avg, 0.0 msec/call avg
total 6 writes, 0.000 sec, 1.0 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=17888).
I also tried removing the attributes, but no luck. I'm guessing it's either a config problem or a query issue.
Your query is:
SELECT postedby, category, totalvotes, trendvalue, featured, isactive, postingdate \
FROM cc_post
Judging from the column names, I don't think any of those columns contain full text. Are you missing the column that holds the text you actually want to search?
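For example (a sketch only — title and description are hypothetical column names; substitute whatever columns in cc_post actually hold the post text), sql_query should select the document ID first and then the text columns you want to be searchable:

sql_query = SELECT postid, title, description, postedby, category, postingdate \
    FROM cc_post \
    WHERE postid BETWEEN $start AND $end

Remember that Sphinx treats the first column of sql_query as the document ID, and every remaining column not declared as an attribute (sql_attr_uint, sql_attr_timestamp, etc.) is indexed as a full-text field. Right now the first selected column is postedby, which likely also explains the "skipped 1 document(s) with zero/NULL ids" warning, and the only things being indexed are numeric/metadata columns, so words like 'solar' or 'tunnel' can never match.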
I'm trying to join the GHCN weather dataset and another dataset:
Weather Dataset: (called "weather" in the code)
station_id  | date     | PRCP | SNOW | SNWD | TMAX | TMIN | Latitude | Longitude | Elevation | State | date_final
CA001010235 | 19730707 | null | null | 0.0  | 0.0  | null | 48.4     | -123.4833 | 17.0      | BC    | 1973-07-07
CA001010235 | 19780337 | 14   | 8    | 0.0  | 0.0  | null | 48.4     | -123.4833 | 17.0      | BC    | 1978-03-30
CA001010595 | 19690607 | null | null | 0.0  | 0.0  | null | 48.5833  | -123.5167 | 17.0      | BC    | 1969-06-07
Species Dataset: (called "ebird" in the code where "ebird_id" is unique for each row in the dataset)
speciesCode | comName               | sciName            | locId    | locName                              | ObsDt            | howMany | lat        | lng          | obsValid | obsReview | locationPrivate | subId     | ebird_id
nswowl      | Northern Saw-whet Owl | Aegolius acadicus  | L787133  | Kumdis Slough                        | 2017-03-20 23:15 | 1       | 53.7392187 | -132.1612358 | TRUE     | FALSE     | TRUE            | S35611913 | eff-178121-fff
wilsni1     | Wilson's Snipe        | Gallinago delicata | L1166559 | Hornby Island--Ford Cove             | 2017-03-20 21:44 | 1       | 49.4973435 | -124.6768427 | TRUE     | FALSE     | FALSE           | S35323282 | abc-1920192-fff
cacgoo1     | Cackling Goose        | Branta hutchinsii  | L833055  | Central Saanich--ȾIKEL (Maber Flats) | 2017-03-20 19:24 | 5       | 48.5724686 | -123.4305167 | TRUE     | FALSE     | FALSE           | S35322116 | yhj-9102910-fff
Expected result: I need to join these tables by finding the closest weather station, on the same date, for each row in the species dataset. In this example, ebird_id "eff-178121-fff" is closest to weather station "CA001010235", at a distance of around 20 km.
speciesCode | comName | sciName | locId | locName | ObsDt | howMany | lat | lng | obsValid | obsReview | locationPrivate | subId | ebird_id | station_id | date | PRCP | SNOW | SNWD | TMAX | TMIN | Latitude | Longitude | Elevation | State | date_final | distance(kms)
nswowl | Northern Saw-whet Owl | Aegolius acadicus | L787133 | Kumdis Slough | 2017-03-20 23:15 | 1 | 53.7392187 | -132.1612358 | TRUE | FALSE | TRUE | S35611913 | eff-178121-fff | CA001010235 | 20170320 | null | null | 0.0 | 0.0 | null | 48.4 | -123.4833 | 17.0 | BC | 2017-03-20 | 20
wilsni1 | Wilson's Snipe | Gallinago delicata | L1166559 | Hornby Island--Ford Cove | 2017-03-20 21:44 | 1 | 49.4973435 | -124.6768427 | TRUE | FALSE | FALSE | S35323282 | abc-1920192-fff | CA001010595 | 20170320 | null | null | 0.0 | 0.0 | null | 48.5833 | -123.5167 | 17.0 | BC | 2017-03-20 |
What I have tried so far: I followed this link and it works on a sample of the datasets, but when I ran the code below on the entire weather and species datasets, the cross join worked while the partitionBy and window-function step took far too long. I also tried replacing the partitionBy and window function with PySpark SQL queries in case that would be faster, but it still takes too long. Is there an optimized way to do this?
from pyspark.sql import Window
from pyspark.sql.functions import asin, col, cos, radians, sin, sqrt
from pyspark.sql.functions import min as min_  # aliased so it does not shadow Python's builtin min

# Pair every observation with every station, then compute the haversine distance
join_df = ebird.crossJoin(weather) \
    .withColumn("dist_longit", radians(weather["Longitude"]) - radians(ebird["lng"])) \
    .withColumn("dist_latit", radians(weather["Latitude"]) - radians(ebird["lat"]))
join_df = join_df.withColumn("haversine_distance_kms", asin(sqrt(
        sin(join_df["dist_latit"] / 2) ** 2 + cos(radians(join_df["lat"]))
        * cos(radians(join_df["Latitude"])) * sin(join_df["dist_longit"] / 2) ** 2
    )) * 2 * 6371).drop("dist_longit", "dist_latit")

# Keep, for each ebird_id, only the row with the minimum distance
W = Window.partitionBy("ebird_id")
result = join_df.withColumn("min_dist", min_(join_df['haversine_distance_kms']).over(W)) \
    .filter(col("min_dist") == col('haversine_distance_kms'))
result.show(1)
Edit:
Size of the datasets:
print(weather.count()) #output: 8211812
print(ebird.count()) #output: 1564574
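One direction that may help (a sketch only, written as Spark SQL since you already tried SQL; it uses the column names shown above, assumes ebird and weather are registered as temp views, and assumes weather.date_final and to_date(ObsDt) compare as the same date — adjust the parsing to your real types): because a match must come from the same date, joining on the date first shrinks the candidate set from the full ~1.5M x ~8.2M cross join down to per-date groups, and ROW_NUMBER then keeps the nearest station per ebird_id in a single pass:

WITH candidates AS (
    SELECT e.*, w.*,
           2 * 6371 * ASIN(SQRT(
               POW(SIN(RADIANS(w.Latitude - e.lat) / 2), 2) +
               COS(RADIANS(e.lat)) * COS(RADIANS(w.Latitude)) *
               POW(SIN(RADIANS(w.Longitude - e.lng) / 2), 2)
           )) AS haversine_distance_kms
    FROM ebird e
    JOIN weather w
      ON w.date_final = to_date(e.ObsDt)   -- same-date stations only
)
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ebird_id
                              ORDER BY haversine_distance_kms) AS rn
    FROM candidates
) ranked
WHERE rn = 1

Note that the inner join drops observations that have no station reading on the same date, and a heavily observed date is still a hot spot; pre-filtering stations to a rough bounding box (or geohash-style bucket) around each observation before computing the exact distance would cut the work further.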
SAP Commerce 1905
Which package should I set to DEBUG or INFO to log the Solr query in the Tomcat logs or on the console?
I'm aware you can see the query under hybris/log/solr, but I also want to see the query in the console as it runs.
Hybris supports this out of the box and will show you the full query that Hybris sends to the Solr server.
The class de.hybris.platform.solrfacetsearch.search.context.listeners.SolrQueryDebuggingListener logs the raw query, the parsed Solr query, the filter queries, and the Solr query explanation (if you use any scoring configuration such as score, boost, or fieldweight).
How to enable it: go to your SolrFacetSearchConfig and add solrQueryDebuggingListener to its listeners list.
Hybris console log:
INFO [hybrisHTTP16] [SolrQueryDebuggingListener] Raw Query: {!boost}(+{!lucene v=$yq})
INFO [hybrisHTTP16] [SolrQueryDebuggingListener] Parsed Solr Query: +(DisjunctionMaxQuery(((variants_string_mv:0001)^20.0 | (ean_string:0001)^100.0 | (variantsSupercategory_text_zz_mv:0001)^50.0 | (keywords_text_zz:0001)^50.0 | (name_text_zz:0001)^90.0 | (code_string:0001)^90.0 | (categoryName_text_zz_mv:0001)^50.0 | (alias_string:0001)^90.0)) DisjunctionMaxQuery((variants_string_mv:0001* | variantsSupercategory_text_zz_mv:0001* | (name_text_zz:*0001*)^150.0 | (code_string:0001*)^45.0)) DisjunctionMaxQuery(((variants_string_mv:0001)^40.0 | (variantsSupercategory_text_zz_mv:0001)^40.0 | (name_text_zz:0001)^100.0)) DisjunctionMaxQuery(((name_text_zz:0001~1)^-1.0)))
INFO [hybrisHTTP16] [SolrQueryDebuggingListener] Filter Queries: [sopOnly_boolean:false, ((soldIndividually_boolean:tzze) AND (testOrderable_boolean:tzze) AND (testSample_boolean:false) AND (yyyIsProductVisible_warehouse_xxx_boolean:tzze) AND (yyyAvailableToSellByDate_warehouse_xxx_boolean:tzze) AND ((((*:* -bundleOnlineFrom_date:*) OR (bundleOnlineFrom_date:[* TO 2021-05-13T15:09:36.856Z])) AND ((*:* -bundleOnlineTo_date:*) OR (bundleOnlineTo_date:[2021-05-13T15:09:36.857Z TO *]))) OR (((categoryName_text_en_mv:*) OR (brandName_text_en_mv:*)) AND ((*:* -productCategoryOnlineFrom_date:*) OR (productCategoryOnlineFrom_date:[* TO 2021-05-13T15:09:36.856Z])) AND ((*:* -productCategoryOnlineTo_date:*) OR (productCategoryOnlineTo_date:[2021-05-13T15:09:36.857Z TO *])) AND ((*:* -productOnlineFrom_date:*) OR (productOnlineFrom_date:[* TO 2021-05-13T15:09:36.856Z])) AND ((*:* -productOnlineTo_date:*) OR (productOnlineTo_date:[2021-05-13T15:09:36.857Z TO *]))))), ((stockQuantity_warehouse_xxx_long:* OR variants_string_mv:*)), ((stockQuantity_warehouse_xxx_long:* OR variantsSupercategory_text_en_mv:*)), (-restrictedTiers_string_mv:400), ((*:* NOT allowedUserGroups_string_mv:*) OR (allowedUserGroups_string_mv:(TP-zzS-AAA OR cls_dc9 OR testcustomergroup OR act_dst OR system_lom3 OR Twelve Percent OR test_AAA OR Nine Percent Award OR customergroup OR lge_ind))), ((*:* NOT allowedProductUserGroups_string_mv:*) OR (allowedProductUserGroups_string_mv:(TP-DDD-AAA OR cls_dc9 OR testcustomergroup OR act_dst OR system_lom3 OR Twelve Percent OR test_AAA OR Nine Percent Award OR customergroup OR lge_ind))), (allowedProductCategoryUserGroups_string_mv:(TP-DDD-AAA OR cls_dc9 OR testcustomergroup OR act_dst OR system_lom3 OR Twelve Percent OR test_AAA OR Nine Percent Award OR customergroup OR lge_ind)), ((isVariantProduct_boolean:false AND NOT testType_string:GROUP) OR {!parent which=(isVariantProduct_boolean:false) v=$childquery}), (catalogId:"ProductCatalog" AND catalogVersion:"Online"), priceValue_DDD-AAA_zzb_double:[0 TO *], priceStartTime_DDD-AAA_zzb_date:[* TO NOW], priceEndTime_DDD-AAA_zzb_date:[NOW TO *], -stockStatus_warehouse_xxx_string:noLongerAvailable]
INFO [hybrisHTTP16] [SolrQueryDebuggingListener] Solr Query explanation: {ProductCatalog/Online/E0001RK=
90.0 = sum of:
90.0 = max of:
90.0 = weight(alias_string:0001 in 737) [SchemaSimilarity], result of:
90.0 = score(doc=737,freq=1.0), product of:
90.0 = boost
1.0 = fieldWeight in 737, product of:
1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
1.0 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
2.0 = docFreq
1621.0 = docCount
1.0 = fieldNorm(doc=737)
}
Add the logger statements below to local.properties (or any other properties file):
log4j2.logger.DefaultFacetSearchStrategy.name = de.hybris.platform.solrfacetsearch.search.impl.DefaultFacetSearchStrategy
log4j2.logger.DefaultFacetSearchStrategy.level = DEBUG
log4j2.logger.DefaultFacetSearchStrategy.appenderRef.stdout.ref = STDOUT
The package or class is de.hybris.platform.solrfacetsearch.search.impl.DefaultFacetSearchStrategy.
I've been trying to figure this out for the past day, but have not been successful.
Problem I am facing
I'm reading a Parquet file that is about 2 GB. The initial read has 14 partitions, which eventually gets split into 200 partitions. I then run a seemingly simple SQL query that takes 25+ minutes, and about 22 of those minutes are spent in a single stage. Looking at the Spark UI, I see that all computation is eventually pushed to about 2 to 4 executors, with lots of shuffling. I don't know what is going on. I would appreciate any help.
Setup
Spark environment - Databricks
Cluster mode - Standard
Databricks Runtime Version - 6.4 ML (includes Apache Spark 2.4.5, Scala 2.11)
Cloud - Azure
Worker Type - 56 GB, 16 cores per machine. Minimum 2 machines
Driver Type - 112 GB, 16 cores
Notebook
Cell 1: Helper functions
load_data = function(path, type) {
input_df = read.df(path, type)
input_df = withColumn(input_df, "dummy_col", 1L)
createOrReplaceTempView(input_df, "__current_exp_data")
## Helper function to run query, then save as table
transformation_helper = function(sql_query, destination_table) {
createOrReplaceTempView(sql(sql_query), destination_table)
}
## Transformation 0: Calculate max date, used for calculations later on
transformation_helper(
"SELECT 1L AS dummy_col, MAX(Date) max_date FROM __current_exp_data",
destination_table = "__max_date"
)
## Transformation 1: Make initial column calculations
transformation_helper(
"
SELECT
cId AS cId
, date_format(Date, 'yyyy-MM-dd') AS Date
, date_format(DateEntered, 'yyyy-MM-dd') AS DateEntered
, eId
, (CASE WHEN isnan(tSec) OR isnull(tSec) THEN 0 ELSE tSec END) AS tSec
, (CASE WHEN isnan(eSec) OR isnull(eSec) THEN 0 ELSE eSec END) AS eSec
, approx_count_distinct(eId) OVER (PARTITION BY cId) AS dc_eId
, COUNT(*) OVER (PARTITION BY cId, Date) AS num_rec
, datediff(Date, DateEntered) AS analysis_day
, datediff(max_date, DateEntered) AS total_avail_days
FROM __current_exp_data
CROSS JOIN __max_date ON __current_exp_data.dummy_col = __max_date.dummy_col
",
destination_table = "current_exp_data_raw"
)
## Transformation 2: Drop row if Date is not valid
transformation_helper(
"
SELECT
cId
, Date
, DateEntered
, eId
, tSec
, eSec
, analysis_day
, total_avail_days
, CASE WHEN analysis_day == 0 THEN 0 ELSE floor((analysis_day - 1) / 7) END AS week
, CASE WHEN total_avail_days < 7 THEN NULL ELSE floor(total_avail_days / 7) - 1 END AS avail_week
FROM current_exp_data_raw
WHERE
isnotnull(Date) AND
NOT isnan(Date) AND
Date >= DateEntered AND
dc_eId == 1 AND
num_rec == 1
",
destination_table = "main_data"
)
cacheTable("current_exp_data_raw")
cacheTable("main_data")
}
spark_sql_as_data_table = function(query) {
data.table(collect(sql(query)))
}
get_distinct_weeks = function() {
spark_sql_as_data_table("SELECT week FROM main_data GROUP BY week")  # "week" only exists in the main_data view
}
Cell 2: Call helper function that triggers the long running task
library(data.table)
library(SparkR)
spark = sparkR.session(sparkConfig = list())
load_data("/mnt/public-dir/file_0000000.parquet", "parquet")
set.seed(1234)
get_distinct_weeks()
Long running stage DAG
Stats about long running stage
Logs
I trimmed the logs down and show below only the entries that appeared multiple times:
BlockManager: Found block rdd_22_113 locally
CoarseGrainedExecutorBackend: Got assigned task 812
ExternalAppendOnlyUnsafeRowArray: Reached spill threshold of 4096 rows, switching to org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter
InMemoryTableScanExec: Predicate (dc_eId#61L = 1) generates partition filter: ((dc_eId.lowerBound#622L <= 1) && (1 <= dc_eId.upperBound#621L))
InMemoryTableScanExec: Predicate (num_rec#62L = 1) generates partition filter: ((num_rec.lowerBound#627L <= 1) && (1 <= num_rec.upperBound#626L))
InMemoryTableScanExec: Predicate isnotnull(Date#57) generates partition filter: ((Date.count#599 - Date.nullCount#598) > 0)
InMemoryTableScanExec: Predicate isnotnull(DateEntered#58) generates partition filter: ((DateEntered.count#604 - DateEntered.nullCount#603) > 0)
MemoryStore: Block rdd_17_104 stored as values in memory (estimated size <VERY SMALL NUMBER < 10> MB, free 10.0 GB)
ShuffleBlockFetcherIterator: Getting 200 non-empty blocks including 176 local blocks and 24 remote blocks
ShuffleBlockFetcherIterator: Started 4 remote fetches in 1 ms
UnsafeExternalSorter: Thread 254 spilling sort data of <Between 1 and 3 GB> to disk (3 times so far)
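For reference, not a diagnosis — just a sketch reusing the view names from the notebook above: the two window aggregates in transformation 1, approx_count_distinct(eId) OVER (PARTITION BY cId) and COUNT(*) OVER (PARTITION BY cId, Date), shuffle every input row by cId, so a skewed cId can pin most of the stage on a handful of tasks. The same two measures can be computed with GROUP BY aggregates (which benefit from map-side partial aggregation) and joined back:

CREATE OR REPLACE TEMPORARY VIEW per_cid AS
SELECT cId, approx_count_distinct(eId) AS dc_eId
FROM __current_exp_data
GROUP BY cId;

CREATE OR REPLACE TEMPORARY VIEW per_cid_date AS
SELECT cId, Date, COUNT(*) AS num_rec
FROM __current_exp_data
GROUP BY cId, Date;

SELECT d.*, a.dc_eId, b.num_rec
FROM __current_exp_data d
JOIN per_cid a ON d.cId = a.cId
JOIN per_cid_date b ON d.cId = b.cId AND d.Date = b.Date;

Since the downstream filter only keeps dc_eId == 1 and num_rec == 1, the two aggregate views could even be filtered before the join, shrinking the data much earlier.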
I have a PostgreSQL server v11.7 that is used 100% for local development only.
Hardware: 16-core CPU, 112 GB memory, 3 TB M.2 SSD. It runs Ubuntu 18.04, but I get roughly the same speed on my Windows 10 laptop when I run the exact same query locally on it.
The DB contains ~1,500 tables (all with the same structure).
Every call to the DB is custom and specific, so there is nothing to cache here.
From Node.js I execute a lot of simultaneous calls (via await Promise.all() over 1,000 promises) and afterwards perform a lot of different calculations.
Currently my stats look like this (max_connections set to the default of 100):
1 call: ~100 ms
1,000 calls: ~15,000 ms (15 ms/call)
I have tried changing various PostgreSQL settings, for example raising max_connections to 1,000, but nothing really seems to improve performance (and yes, I do remember to restart the PostgreSQL service every time I make a change).
How can I make the execution of the 1,000 simultaneous calls as fast as possible? Should I consider copying all the needed data into an in-memory database like Redis instead?
The DB table looks like this:
CREATE TABLE public.my_table1 (
id int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
tradeid int8 NOT NULL,
matchdate timestamptz NULL,
price float8 NOT NULL,
"size" float8 NOT NULL,
issell bool NOT NULL,
CONSTRAINT my_table1_pkey PRIMARY KEY (id)
);
CREATE INDEX my_table1_matchdate_idx ON public.my_table1 USING btree (matchdate);
CREATE UNIQUE INDEX my_table1_tradeid_idx ON public.my_table1 USING btree (tradeid);
The simple test query fetches 30 minutes of data between two timestamps:
select * from my_table1 where '2020-01-01 00:00' <= matchdate AND matchdate < '2020-01-01 00:30'
total_size_incl_toast_and_indexes 21 GB total table size --> 143 bytes/row
live_rows_in_text_representation 13 GB total table size --> 89 bytes/row
My NodeJS code looks like this:
const startTime = new Date();
let allDBcalls = [];
let totalRawTrades = 0;
(async () => {
for(let i = 0; i < 1000; i++){
allDBcalls.push(selectQuery.getTradesBetweenDates(tickers, new Date('2020-01-01 00:00'), new Date('2020-01-01 00:30')).then(function (rawTradesPerTicker) {
totalRawTrades += rawTradesPerTicker["data"].length;
}));
}
await Promise.all(allDBcalls);
_wl.info(`Fetched ${totalRawTrades} raw-trades in ${new Date().getTime() - startTime} ms!!`);
})();
I just ran EXPLAIN (ANALYZE, BUFFERS) four times on the query:
EXPLAIN (ANALYZE,BUFFERS) SELECT * FROM public.my_table1 where '2020-01-01 00:00' <= matchdate and matchdate < '2020-01-01 00:30';
Index Scan using my_table1_matchdate_idx on my_table1 (cost=0.57..179.09 rows=1852 width=41) (actual time=0.024..0.555 rows=3013 loops=1)
Index Cond: (('2020-01-01 00:00:00+04'::timestamp with time zone <= matchdate) AND (matchdate < '2020-01-01 00:30:00+04'::timestamp with time zone))
Buffers: shared hit=41
Planning Time: 0.096 ms
Execution Time: 0.634 ms
Index Scan using my_table1_matchdate_idx on my_table1 (cost=0.57..179.09 rows=1852 width=41) (actual time=0.018..0.305 rows=3013 loops=1)
Index Cond: (('2020-01-01 00:00:00+04'::timestamp with time zone <= matchdate) AND (matchdate < '2020-01-01 00:30:00+04'::timestamp with time zone))
Buffers: shared hit=41
Planning Time: 0.170 ms
Execution Time: 0.374 ms
Index Scan using my_table1_matchdate_idx on my_table1 (cost=0.57..179.09 rows=1852 width=41) (actual time=0.020..0.351 rows=3013 loops=1)
Index Cond: (('2020-01-01 00:00:00+04'::timestamp with time zone <= matchdate) AND (matchdate < '2020-01-01 00:30:00+04'::timestamp with time zone))
Buffers: shared hit=41
Planning Time: 0.097 ms
Execution Time: 0.428 ms
Index Scan using my_table1_matchdate_idx on my_table1 (cost=0.57..179.09 rows=1852 width=41) (actual time=0.016..0.482 rows=3013 loops=1)
Index Cond: (('2020-01-01 00:00:00+04'::timestamp with time zone <= matchdate) AND (matchdate < '2020-01-01 00:30:00+04'::timestamp with time zone))
Buffers: shared hit=41
Planning Time: 0.077 ms
Execution Time: 0.586 ms
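Since EXPLAIN shows each individual range scan finishing in well under a millisecond, most of the ~15 ms per call is presumably spent outside query execution: waiting for a free connection, protocol round trips, and Node-side work. One common way to cut that overhead is to batch several of the per-ticker range scans into a single statement; a sketch, where my_table2 and my_table3 are hypothetical stand-ins for other per-ticker tables of the same structure:

SELECT 'my_table1' AS ticker, * FROM public.my_table1
WHERE '2020-01-01 00:00' <= matchdate AND matchdate < '2020-01-01 00:30'
UNION ALL
SELECT 'my_table2', * FROM public.my_table2
WHERE '2020-01-01 00:00' <= matchdate AND matchdate < '2020-01-01 00:30'
UNION ALL
SELECT 'my_table3', * FROM public.my_table3
WHERE '2020-01-01 00:00' <= matchdate AND matchdate < '2020-01-01 00:30';

Each branch still uses its table's matchdate index, but the client pays for one round trip instead of three.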
I have a collection of slow query logs from RDS that I concatenated into a single file. I'm trying to run it through pt-query-digest, following the instructions here, but it reads the whole file as a single query.
Command:
pt-query-digest --group-by fingerprint --order-by Query_time:sum collider-slow-query.log > slow-query-analyze.txt
Output, showing that it only analyzed one query:
# Overall: 1 total, 1 unique, 0 QPS, 0x concurrency ______________________
Here's a short snippet from the file being analyzed, containing two queries, to show that there are many queries in it:
2019-05-03T20:44:21.828Z # Time: 2019-05-03T20:44:21.828954Z
# User@Host: username[username] @ [ipaddress] Id: 19
# Query_time: 17.443164 Lock_time: 0.000145 Rows_sent: 5 Rows_examined: 121380
SET timestamp=1556916261;
SELECT wp_posts.ID FROM wp_posts LEFT JOIN wp_term_relationships ON (wp_posts.ID = wp_term_relationships.object_id) WHERE 1=1 AND wp_posts.ID NOT IN (752921) AND (
wp_term_relationships.term_taxonomy_id IN (40)
) AND wp_posts.post_type = 'post' AND ((wp_posts.post_status = 'publish')) GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 5;
2019-05-03T20:44:53.597Z # Time: 2019-05-03T20:44:53.597137Z
# User@Host: username[username] @ [ipaddress] Id: 77
# Query_time: 35.757909 Lock_time: 0.000054 Rows_sent: 2 Rows_examined: 199008
SET timestamp=1556916293;
SELECT post_id, meta_value FROM wp_postmeta
WHERE meta_key = '_wp_attached_file'
AND meta_value IN ( 'family-guy-vestigial-peter-slice.jpg','2015/08/bobs-burgers-image.jpg','2015/08/bobs-burgers-image.jpg' );
Why isn't it reading all the queries? Is there a problem with my concatenation?
I had the same problem. It turns out that an additional timestamp is written at the start of each log entry (controlled by this variable: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_log_timestamps).
This prevents pt-query-digest from recognizing the individual queries. It is easily fixed by either turning off the timestamp variable or removing the timestamp, e.g. from an entry like this:
2019-10-28T11:17:18.412 # Time: 2019-10-28T11:17:18.412214
# User@Host: foo[foo] @ [192.168.8.175] Id: 467836
# Query_time: 5.839596 Lock_time: 0.000029 Rows_sent: 1 Rows_examined: 0
use foo;
SET timestamp=1572261432;
SELECT COUNT(*) AS `count` FROM `foo`.`invoices` AS `Invoice` WHERE 1 = 1;
By removing the first timestamp (the 2019-10-28T11:17:18.412 part), it works again.