Extracting a range of keys from leveldb or redis - node.js

I would like to extract a range of keys from either leveldb or redis. For example, I have the following key structure:
group:1/member:1
group:1/member:1/log:1
group:1/member:1/log:2
group:1/member:1/log:3
group:1/member:1/log:4
group:1/member:2
group:1/member:2/log:1
group:1/member:2/log:2
group:1/member:3
group:1/member:3/log:1
I would like to get all members (member:1, member:2, member:3), but I do not want their log entries to be included in the results (there may be thousands of logs). What is the best approach to achieving this with a KV store like redis or leveldb?

For LevelDB, you can use an iterator (leveldb::Iterator in the native API) to iterate over the key space and keep only the keys that match your pattern.
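As a concrete illustration, here is a hedged node.js sketch assuming the level package and the key layout above; the database path and range bounds are placeholders. It streams only the keys in the member range and drops the log sub-keys client-side.

var level = require('level');
var db = level('./mydb'); // placeholder path

var members = [];
db.createKeyStream({
    gte: 'group:1/member:',      // start of the member range
    lt: 'group:1/member:\xff'    // exclusive upper bound of the range
  })
  .on('data', function (key) {
    // The range still contains the log sub-keys, so skip them here.
    if (key.indexOf('/log:') === -1) {
      members.push(key);
    }
  })
  .on('end', function () {
    console.log(members); // ['group:1/member:1', 'group:1/member:2', 'group:1/member:3']
  });

Note that this still iterates over the log keys inside the range; a key layout that keeps logs under a separate prefix would let the range skip them entirely.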
For Redis, you can use the SCAN command to scan the key space with a pattern.
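A similar hedged sketch for Redis, assuming the ioredis client (node_redis exposes the same SCAN command). The glob pattern also matches the log keys, so they are filtered out client-side.

var Redis = require('ioredis');
var redis = new Redis();

var members = [];
var stream = redis.scanStream({ match: 'group:1/member:*', count: 100 });

stream.on('data', function (keys) {
  keys.forEach(function (key) {
    // SCAN MATCH cannot express "no /log: suffix", so filter here.
    if (key.indexOf('/log:') === -1) {
      members.push(key);
    }
  });
});

stream.on('end', function () {
  console.log(members);
});

SCAN walks the whole keyspace, so if this query sits on a hot path it may be cheaper to maintain a dedicated SET of member ids per group and read that directly.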

Related

Amazon S3 - How to recursively rename files?

I'm trying to fetch my files via the s3.getObject() method in my node.js backend.
Trouble is, upon uploading the files to my bucket, I failed to replace special characters, dashes, and white-space. So any file whose Key contains those characters ends up with an awkward endpoint (e.g., a Key of 10th Anniversary Party (Part 1) 1-23-04 has an endpoint of 10th+Anniversary+Party+(Part+1)+1-23-04).
This becomes troublesome when trying to encode the URI for fetching. I'd like to replace all dashes, white-space, and special characters with a simple underscore. I've seen some possible approaches using the aws-cli, but I am unsure what the best command for this is. Any advice would be greatly appreciated.
You could write a program that:
Lists the contents of the bucket
Calls CopyObject() to copy the object to a new Key
Calls DeleteObject() to delete the previous copy
Or, you could take advantage of the fact that the AWS CLI offers an aws s3 mv command that will Copy + Delete for you.
I often simply create an Excel spreadsheet with the existing names, and a formula for determining what name I'd like. Then, I create a third column with:
aws s3 mv [Column 1] [Column 2]
Use Copy Down on the rows to get all the mv commands. Then, copy the column of commands, paste them into the command-line and it will rename all the objects in Amazon S3! (Test with 1-2 lines first, in case there is an error in the formula.)
This might seem primitive, but it's a very quick way to make the changes.
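For the programmatic route described above, here is a hedged sketch with the AWS SDK for JavaScript (v2); the bucket name and the renaming rule are placeholders, and only the first page of the listing (up to 1000 keys) is handled.

var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var BUCKET = 'my-bucket'; // placeholder

function sanitize(key) {
  // Replace white-space, dashes and other special characters with underscores.
  return key.replace(/[^A-Za-z0-9_.\/]+/g, '_');
}

s3.listObjectsV2({ Bucket: BUCKET }).promise()
  .then(function (res) {
    return Promise.all(res.Contents.map(function (obj) {
      var newKey = sanitize(obj.Key);
      if (newKey === obj.Key) return null; // nothing to rename
      return s3.copyObject({
          Bucket: BUCKET,
          // CopySource must be URL-encoded, which matters for keys with spaces etc.
          CopySource: BUCKET + '/' + encodeURIComponent(obj.Key),
          Key: newKey
        }).promise()
        .then(function () {
          return s3.deleteObject({ Bucket: BUCKET, Key: obj.Key }).promise();
        });
    }));
  })
  .then(function () { console.log('done'); })
  .catch(console.error);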

SQL Azure Always Encrypted column - how to change column's size when encrypted?

I have a SQL Azure DB with an Always Encrypted column (with Key Vault) which is VARCHAR(6), and the business now requires its size to change. How can I do this? I haven't found anything in the docs or anywhere else.
One thought would be to decrypt the column and encrypt it again. Is there an easy way to do this?
This is actually possible and really easy:
ALTER TABLE [LOGS].[SOMETABLE]
ALTER COLUMN [CARDNUM] [varchar](19)
    COLLATE Latin1_General_BIN2
    ENCRYPTED WITH (
        COLUMN_ENCRYPTION_KEY = [CEK_Auto1],
        ENCRYPTION_TYPE = Deterministic,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
    ) NULL
GO
And it works like a charm.

Verifiable logfile at customer site

We want to create a logfile at customer site where
the customer is able to read the log (plain text)
we can verify at our site that the log file isn't manipulated
A few hundred bytes of unreadable data are okay, but some customers will not send us files when they cannot verify that the files contain no sensitive data.
The only reasonable option I see so far is to append a cryptographic checksum (e.g. SHA256(SECRET_VALUE + "logtext")). The SECRET_VALUE would be something hardcoded which is plain "security through obscurity". Is there any better way?
We use the DotNet-library and I do not want to implement any crypto algorithm by hand if that matters.
You can use the standard HMAC algorithm with a secret key to compute the checksum.
Using a secret key prevents, in a simple way, the checksum from being regenerated by someone else. A hardcoded key could be extracted from the code, but for your use case I think that is enough.
The result is a binary hash. To insert it into the text file, encode the value as hexadecimal or Base64, and make sure you can reverse the process on your side so you can recalculate the hash from the original file.
You could also use a detached hash file to avoid modifying the log file.
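For illustration, a minimal sketch of the per-line HMAC idea, shown in node.js to stay consistent with the rest of this page; the .NET equivalent is System.Security.Cryptography.HMACSHA256 mentioned below. The key and the log format are placeholders.

var crypto = require('crypto');

var SECRET_KEY = 'replace-with-your-key'; // placeholder; a hardcoded key can be extracted

function signLine(line) {
  // Hex keeps the log human-readable; Base64 would be slightly shorter.
  var mac = crypto.createHmac('sha256', SECRET_KEY).update(line, 'utf8').digest('hex');
  return line + ' | hmac=' + mac;
}

function verifyLine(signedLine) {
  var parts = signedLine.split(' | hmac=');
  var expected = crypto.createHmac('sha256', SECRET_KEY).update(parts[0], 'utf8').digest('hex');
  return parts[1] === expected; // in production, prefer a constant-time comparison
}

console.log(signLine('2016-01-01 12:00:00 user logged in'));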
Target
customer-readable logfiles
verifiable on our side
minimum of binary data
must work offline
Options
Public-Private-key-things... (RSA, ...)
would be secure
but only binary data
Add a signature
We are not the first ones with that idea ( https://en.wikipedia.org/wiki/Hash-based_message_authentication_code )
DotNet supports that ( System.Security.Cryptography.HMACSHA256 )
Key must be stored somewhere ... in source
Even with obfuscation: not possible to do so securely
Trusted Timestamping
again: we are not first ( https://en.wikipedia.org/wiki/Trusted_timestamping )
needs connection to "trusted third party" (means: a web service)
Build Hash + TimeStamp -> send to third party -> sign the data (public-private-key stuff) -> send back
Best option so far
Add a signature with HMAC
Store the key in native code (not THAT easy to extract)
Get code obfuscation running and build some extra loops in C#
Every once in a while (5 min?) put a signature into the log AND into the Windows application log
the application log is at least basically secured against modification (read-only)
and it's collected by our error reporting
easy for the customer to overlook (evil grin)

select all and truncate redis database

I'm looking for something similar to BLPOP, but instead of popping a single element I want to get them all at once, rather than looping over them one by one.
That is, I want to get all the records of a Redis list and then truncate it.
Consider using a Lua script to do the LRANGE+DEL atomically.
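A hedged sketch of that Lua approach, assuming the ioredis client; EVAL runs LRANGE + DEL as one atomic step on the server, and the list name is a placeholder.

var Redis = require('ioredis');
var redis = new Redis();

var script =
  "local items = redis.call('LRANGE', KEYS[1], 0, -1) " +
  "redis.call('DEL', KEYS[1]) " +
  "return items";

redis.eval(script, 1, 'yourlist').then(function (items) {
  // items holds every element; the list itself is already gone.
  items.forEach(function (item) {
    console.log(item);
  });
});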
Or use RENAME to move the list to a temporary key which you will use to process the data.
RENAME yourlist temp-list
LRANGE temp-list 0 -1
... process the list
DEL temp-list

Using geospatial commands in rethinkdb with changefeed

right now I have a little problem:
I want to use geospatial commands (like getIntersecting) together with the changefeed feature of rethinkdb but I always get:
RqlRuntimeError: Cannot call changes on an eager stream in: r.db("Test").table("Message").getIntersecting(r.circle([-117.220406,32.719464], 10, {unit: 'mi'}), {index: 'loc'})).changes()
the big question is: Can I use getIntersecting with the changes() (couldn't find anything related to that in the docs btw ...) or do I have to abandon the idea of using rethinkdb geospatial features and just use change() to get ALL added or changed documents and do the geospatial stuff outside of rethinkdb?
You can't use .getIntersecting with .changes, but you can write essentially the same query by adding a filter after .changes that checks whether the loc is within the circle. While .changes limits what you can write before it, you can write basically any query after the .changes and it will work.
r.table('Message')
  .changes()
  .filter(
    r.circle([-117.220406, 32.719464], 10, {unit: 'mi'})
      .intersects(r.row('new_val')('loc'))
  )
Basically, every time there is a change in the table the update will get pushed to the changefeed, but updates whose loc falls outside the circle will get filtered out. Since there is not a lot of support for combining geospatial queries with changefeeds, this is more or less how you would need to integrate the two.
In the future, changefeeds will be much broader and you'll be able to write basically any query with .changes at the end.
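For completeness, a hedged sketch of consuming such a feed with the official JavaScript driver; the connection details are placeholders.

var r = require('rethinkdb');

r.connect({ host: 'localhost', port: 28015 }, function (err, conn) {
  if (err) throw err;

  r.db('Test').table('Message')
    .changes()
    .filter(
      r.circle([-117.220406, 32.719464], 10, {unit: 'mi'})
        .intersects(r.row('new_val')('loc'))
    )
    .run(conn, function (err, cursor) {
      if (err) throw err;
      cursor.each(function (err, change) {
        if (err) throw err;
        // change.new_val is the document that just entered (or changed inside) the circle.
        console.log(change);
      });
    });
});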
