Ubuntu 14.04 arbtt-stats index too large error - linux

I've recently installed arbtt, which seems to be an interesting, rule-based, automatic time tracker. http://arbtt.nomeata.de/#what
I've got it working for the most part, but after 30 minutes or so of gathering stats, I end up with the following error.
Processing data [=>......................................................................................................................................................................................] 1%
arbtt-stats: Prelude.(!!): index too large
Does anyone have any suggestions on ways I can troubleshoot this issue, or better yet, solve it? I have 0 experience with the coding language used to create the rules (Haskell I believe). All I've done to this point is follow the documentation as closely as possible.
This error ultimately renders the tool useless since it doesn't gather data any longer than 30 minutes. To fix it, I have to delete the log and start from scratch. I'm primarily interested in the notion of having a customizable, rule based time tracker but I'm by no means tied to using arbtt.
Based on the comments, I'm including some more information below.
When I try to run arbtt-recover I get a long list of errors that look like this. All of them seem to be related to an unsupported TimeLogEntry version tag.
Trying at position 1726098.
Failed to read value at position 1726098:
Unsupported TimeLogEntry version tag 0
As for the configuration file, here is what I have so far.
$idle > 30 ==> tag inactive,
-- A rule that matches on a list of strings
current window $program == ["Chrome", "Firefox"] ==> tag Web,
current window $program == ["skype"] ==> tag Skype,
current window $program == ["jetbrains-phpstorm"] ==> tag PhpStorm,
( current window $title =~ m!Inbox! ||
current window $title =~ m!Outlook! ) ==> tag Emails,
( current window $title =~ m!AdWords! ||
current window $title =~ m!Analytics! ) ==> tag Adwords,
It goes on further, but I'm fairly confident the remaining lines follow the same syntax; they're just project/client specific. If required, I'm happy to include the rest of the file.

As discussed in the comments: this is a case of a corrupt ~/.arbtt/capture.log. You can usually fix this by
running arbtt-recover
and then moving ~/.arbtt/capture.log.recovered to ~/.arbtt/capture.log.
The second manual step is required to avoid accidentally deleting too much data. You can check that the recovered file is better by running arbtt-stats against it, passing --logfile=~/.arbtt/capture.log.recovered.
Data corruption happens, for example, when there is an unclean shutdown, or for other undetermined reasons. But the log file format is such that even after a corruption (e.g. a partial write of one sample), further samples will be written correctly and should be picked up by arbtt-recover, so you will not have lost more than a few samples.
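Put together, the recovery sequence described above looks roughly like this (paths exactly as in this answer; adjust them if your arbtt data lives elsewhere):
arbtt-recover
arbtt-stats --logfile=~/.arbtt/capture.log.recovered    # sanity-check the recovered log first
mv ~/.arbtt/capture.log.recovered ~/.arbtt/capture.log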

Related

How Do I resolve "Illuminate\Queue\InvalidPayloadException: Unable to JSON encode payload. Error code: 5"

Trying out the queue system for a better user upload experience with Laravel-Excel.
.env has been changed from 'sync' to 'database' and the migrations run. All the necessary use statements are in place, yet the error above persists.
The exact error happens here:
Illuminate\Queue\Queue.php:97
$payload = json_encode($this->createPayloadArray($job, $queue, $data));
if (JSON_ERROR_NONE !== json_last_error()) {
throw new InvalidPayloadException(
If I drop ShouldQueue, the file imports perfectly in-session (it's a large file, so there's a long wait for the user).
I've read many Stack Overflow, GitHub, etc. comments on this, but I don't have the technical skills to deep-dive and fix my particular situation (most of them speak of UTF-8, but I don't know if that's an issue here; I changed the Excel save format to UTF-8 but it didn't fix it).
P.S. While running the migration, I got the error:
SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes (SQL: alter table `jobs` add index `jobs_queue_index`(`queue`))
I bypassed this by dropping the 'add index' part, so my jobs table is not indexed on queue, but I don't feel this is the cause.
One thing you can do when looking into json_encode() errors is use the json_last_error_msg() function, which will give you a bit more of a readable error message.
In your case you're getting a '5' back, which is the JSON_ERROR_UTF8 error code. The error message back for this is a slightly more informative one:
'Malformed UTF-8 characters, possibly incorrectly encoded'
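For example, a minimal sketch (the $data variable here is just a stand-in for whatever you are encoding):
$payload = json_encode($data);
if (json_last_error() !== JSON_ERROR_NONE) {
    // e.g. "Malformed UTF-8 characters, possibly incorrectly encoded"
    echo json_last_error_msg();
}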
So we know it's encountering non-UTF-8 characters, even though you're saving the file specifically with UTF-8 encoding. At first glance you might think you need to convert the encoding yourself in code (like this answer), but in this case, I don't think that'll help. For Laravel-Excel, this seems to be a limitation of trying to queue-read .xls files - from the Laravel-Excel docs:
You currently cannot queue xls imports. PhpSpreadsheet's Xls reader contains some non-utf8 characters, which makes it impossible to queue.
In this case you might be stuck with a slow, non-queueable option, or need to convert your spreadsheet into a queueable format e.g. .csv.
The key length error on running the migration is unrelated. It has been around for a while and is a side-effect of using an older version of MySQL/MariaDB. Check out this answer and the Laravel documentation around index lengths - you need to add this to your AppServiceProvider::boot() method:
Schema::defaultStringLength(191);
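In context, that ends up looking roughly like this in app/Providers/AppServiceProvider.php (a sketch based on the standard Laravel scaffolding):
namespace App\Providers;

use Illuminate\Support\Facades\Schema;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // Cap the default string length so index keys fit the 767-byte limit of older MySQL/MariaDB
        Schema::defaultStringLength(191);
    }
}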

Regular Expressions and SQL Server Error Logs - All false results

Ok, I have done my searching and I have tried many things. I think it is time to put my question here:
I have been working on taking in other users' SQL Server error logs, parsing the rows out into columns, then bulk inserting the data 1000 rows at a time. I troubleshoot SQL Server for other people, so sp_readerrorlog will only show me my local instance. Finding root cause involves four sets of logs (SQL Server, Application Event, System Event, and get-clusterlog output) and matching up timestamps. A fast load into SQL Server, along with the ability to pull the exact timeframe needed, will shorten my time spent staring at log files.
I am currently bottlenecked in testing the rows with a regular expression, which does work if I feed it data myself:
import re

def sqlrowmatch(row):
    # True if the row starts a new log entry, i.e. contains a timestamp like 2018-10-13 22:40:09.41
    pattern = re.compile(r'\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d.\d\d')
    if pattern.search(row):
        return True
    else:
        return False
Given any string that matches the pattern above (e.g. 1111-11-11 11:11:11.11), it returns True. The idea is that if a line in a SQL Server error log matches, it marks the start of a separate entry; this allows memory graphs, deadlock graphs, and dumps to all be grouped into one entry instead of being split over several lines.
However, if I point it at one of the SQL Server error logs, there seem to be extra characters, and these give re.search a hard time finding a match. If I feed any line into this function, sqlrowmatch(), it reports back False for all rows.
ÿþ <-- these appear to be the first two characters of the first line. re.search just doesn't find a match anywhere in the different elements.
False is what is returned for every line if I call the function inside the 'with open' statement:
with open(file, 'r') as sqllog:
    for line in sqllog:
        print(sqlrowmatch(line))
The first line should always come back True if sqlrowmatch() is working:
2018-10-13 22:40:09.41 Server Microsoft SQL Server 2016 (SP2-CU2-GDR) (KB4458621) - 13.0.5201.2 (X64)
So I am lost and my current project is at a halt. Perhaps some seasoned insight from this group can get me going again.
TIA
Interestingly enough, I found my answer here: Opening huge text file, unicode issue
The open should be done with encoding='utf-16'.
It now matches appropriately
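Applied to the loop from the question (same file variable), that is simply:
with open(file, 'r', encoding='utf-16') as sqllog:
    for line in sqllog:
        print(sqlrowmatch(line))    # now True for lines that start a new entry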

Linux Date not showing the date value sometimes

I have defined a variable inside one of my shell scripts to create a file name with the date value in it.
I used the "date +%Y%m%d" command to get the current date, which was stored in the date_val variable.
And I defined the filename variable as "${path}/sample_${date_val}.txt".
For few days it was creating the file name properly as /programfiles/sample_20180308.txt
But today the filename was created without date as /programfiles/sample_.txt
When I try to execute the command "date +%Y%m%d" in Linux, it returns the correct value - 20180309.
Any idea why the filename was created without the date value? I did not modify anything in my script either, so I'm wondering what might have gone wrong.
Sample excerpt of my script is given below for easy understanding :
EDITED
path=/programfiles
date_val=$(date +%Y%m%d )
file_name=${path}/sample_${date_val}.txt
Although incredibly unlikely, it's certainly possible for date to fail, based on the source code. Under the covers, it calls either clock_gettime() or gettimeofday(), both of which can fail.
The date program will also refuse to output anything to standard output if the date from either of those two functions is out of range (which is possible if they fail).
It's also possible that the date program could "disappear" for various reasons, such as actually being hidden or permissions changed, or a shortage of resources like file handles when attempting to open the executable.
As mentioned, all these possibilities are a stretch, unlikely to happen in the real world.
If you want to handle the case where you get inadequate output from date, you can simply retry until you get a valid value, something like the following (possibly with a limit added so you can detect if it's never any good):
todaysDate="$(date +%Y%m%d)"
while [[ ! $todaysDate =~ ^[0-9]{8}$ ]] ; do
    sleep 1
    todaysDate="$(date +%Y%m%d)"
done
# todaysDate now guaranteed to be eight digits.
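Folded into the script excerpt from the question, that might look like this (a sketch; variable names as in the question):
path=/programfiles
date_val=$(date +%Y%m%d)
while [[ ! $date_val =~ ^[0-9]{8}$ ]] ; do
    sleep 1
    date_val=$(date +%Y%m%d)
done
file_name=${path}/sample_${date_val}.txt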

How to check the available free space in Haxe?

I can't find a way to check the free space available on a device using Haxe, OpenFL, Lime or another library.
I would like to avoid downloading data that would exceed the size recommended for an app on each device.
What do you do to check that?
Try creating a file of that size! Then either delete it or reopen and write (not append) over its contents.
I don't know whether all platforms Haxe supports will work fine with this trick, but this algorithm is reported to work in many places and languages (I personally tested it in Ruby and saw the same suggestion for C++/.NET). To check whether X bytes of disk space are available:
open a new file for writing
seek X-1 bytes from the beginning
write a byte of data (whatever you want, 0, 42...)
close the file (probably unrelated to the task at hand, but don't forget to do that anyway)
If there's insufficient disk space, you'll likely get an exception at some point in this algorithm. You'll have to find out what errors to expect and process them properly.
Using ihx I've found that this works and requires nothing but the Haxe standard library:
haxe interactive shell v0.3.4
type "help" for help
>> import sys.io.*;
>> var f = File.write('loca', true)
sys.io.FileOutput : { __f => #abstract }
>> f.seek(39999, FileSeek.SeekBegin)
Void : null
>> f.writeByte(0)
Void : null
>> f.close()
Void : null
After these manipulations, I had a file named loca of exactly 40000 bytes in my working directory.
By the way, be careful when doing things like these in ihx since it re-runs the entire session with the last entered line appended each time.
Ongoing experimentation:
However, when there's insufficient disk space, it may not fail with errors. In this case you'll have to check the real size with sys.FileSystem.stat(path).size. And don't forget to delete the file if there's not enough space.
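Putting the pieces above together (the seek/write trick plus the stat check), a rough sketch might look like the following; DiskSpace, hasFreeSpace and the probe-file handling are my own naming and choices, not a standard API:
import sys.FileSystem;
import sys.io.File;
import sys.io.FileSeek;

class DiskSpace {
    // Try to materialize a probe file of `bytes` bytes, then verify its real size.
    public static function hasFreeSpace(probePath:String, bytes:Int):Bool {
        var ok = false;
        try {
            var f = File.write(probePath, true);     // open for binary writing
            f.seek(bytes - 1, FileSeek.SeekBegin);   // seek to the last byte
            f.writeByte(0);                          // force the file out to that length
            f.close();
            // Some targets may not throw on failure, so check the real size too.
            ok = FileSystem.stat(probePath).size >= bytes;
        } catch (e:Dynamic) {
            ok = false;
        }
        // Always clean up the probe file.
        if (FileSystem.exists(probePath)) FileSystem.deleteFile(probePath);
        return ok;
    }
}
Something like DiskSpace.hasFreeSpace('loca', 40000) would then mirror the ihx session above.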

How can I make cucumber run all the steps (not skip them) even if one of them fails?

I am using Cucumber with RubyMine, and I have a scenario with steps that verify some special controls in a form (I am using Cucumber for automation testing). The controls don't have anything to do with each other, and there is no reason for the remaining steps to be skipped if one before them fails.
Does anyone know what configuration or commands I should use to run all the steps in a scenario even if some of them fail?
I think the only way to achieve the desired behavior (which is quite uncommon) is to define custom steps and catch the exceptions in them yourself. According to the Cucumber wiki, a step fails if it raises an error, and almost all default steps raise an error if they can't find or interact with an element on the page. If you catch those exceptions the step will be marked as passed, but in the rescue you can provide custom output. I also recommend you carefully define which exceptions you want to catch: if you're OK with Selenium not finding an element on the page, rescue only ElementNotFound exceptions; don't catch all exceptions.
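A minimal sketch of that idea (the step text and selector are made up for illustration; note the narrow rescue, as suggested above):
Then /^the "([^"]*)" control should be present$/ do |selector|
  begin
    find(selector)   # Capybara raises Capybara::ElementNotFound if it can't locate the element
  rescue Capybara::ElementNotFound => e
    puts "Control check failed, continuing with the remaining steps: #{e.message}"
  end
end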
I've seen a lot of threads on the Web about people wanting to continue steps execution if one failed.
I've discussed with Cucumber developers: they think this is a bad idea: https://groups.google.com/forum/#!topic/cukes/xTqSyR1qvSc
Many times, scenarios can be reworked to avoid this need: scenarios must be split into several smaller and independent scenarios, or several checks can be aggregated into one, providing a more human scenario and a less script-like scenario.
But if you REALLY need this feature, like our project do, we've done a fork of Cucumber-JVM.
This fork lets you annotate steps so that when they fail with a specified exception, they will let the next steps execute anyway (and the step itself is marked as failed).
The fork is available here:
https://github.com/slaout/cucumber-jvm/tree/continue-next-steps-for-exceptions-1.2.4
It's published on the OSSRH Maven repository.
See the README.md for usage, explanation screenshot and Maven dependency.
It's only available for the Java language, though: any help is welcome to adapt the code to Ruby, for instance. I don't think it will be a lot of work.
The question is old, but hopefully this will be helpful. What I'm doing feels kind of "wrong", but it works. In your web steps, if you want to keep going, you have to catch exceptions. I'm doing that primarily to add helpful failure messages. I'm checking a table full of values that are identified in Cucumber with a table having a bunch of rows like:
Then my result should be:
| Row Identifier | Column Identifier | Subcolumn Identifier | $1,247.50 |
where the identifiers make sense in the application domain, and name a specific cell in the results table in a human-friendly way. I have helpers that convert the human identifiers to DOM IDs, which are used to first check whether the row I'm looking for exists at all, then look for the specific value in a cell in that row. The default failure message for a missing row is clear enough for me (expected to find css "tr#my_specific_dom_id" but there were no matches). But the failure message for checking specific text in a cell is completely unhelpful. So I made a step that catches the exception and uses the Cucumber step info and some element searching to get a good failure message:
Then /^my application domain results should be:$/ do |table|
  table.rows.each do |row|
    row_id = dom_id_for(row[0])
    cell_id = dom_id_for(row[0], row[1], row[2])
    page.should have_css "tr##{row_id}"
    begin
      page.should have_xpath("//td[@id='#{cell_id}'][text()=\"#{row[3].strip.lstrip}\"]")
    rescue Capybara::ExpectationNotMet => exception
      # find returns a Capybara::Element, native returns a Selenium::WebDriver::Element
      contents = find(:xpath, "//td[@id='#{cell_id}']").native.text
      puts "Expected #{ row[3] } for #{ row[0,2].join(' ') } but found #{ contents } instead."
      @step_failures_were_rescued = true
    end
  end
end
Then I define a hook in features/support/hooks.rb like:
After do |scenario|
  unless scenario.failed?
    raise Capybara::ExpectationNotMet if @step_failures_were_rescued
  end
end
This makes the overall scenario fail, but it masks the step failure from Cucumber, so all the step results are green, including the ones that aren't right. You have to see the scenario failure, then look back at the messages to see what failed. This seems kind of "bad" to me, but it works. It's WAY more convenient in my case to get the expected and found values listed in a domain-friendly context for the whole table I'm checking, rather than to get a message like "I looked for "$123.45" but I couldn't find it." There might be a better way to do this using the Capybara "within" method. This is the best I've come up with so far though.
