My objective is to detect actions performed by users that resulted in an access denied or unauthorized error, using activity logs.
To detect errors I use the "resultType" field: when it is "Failure", I know the record is an error record. I want to go one step further and filter out the records that are "access denied" or "unauthorized" errors.
I have considered the following fields as potential candidates so far, but haven't found any relevant information in them:
resultDescription
properties.statusCode
Below is a sample schema of the activity log we receive on our end. The schema looks like this because we stream our activity log to a storage account (https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/activity-log-schema#schema-from-storage-account-and-event-hubs):
When streaming the Azure Activity log to a storage account or event hub, the data follows the resource log schema.
{
"callerIpAddress" : "0.0.0.0",
"resourceGroup" : "group",
"resourceId" : "dummy",
"level" : "Information",
"production" : false,
"operationName" : "MICROSOFT.WEB/DUMMY",
"ingestTime" : "time",
"resultSignature" : "Succeeded.OK",
"accountId" : "dummyId",
"identity" : {
"authorization" : {
"evidence" : {
"roleAssignmentScope" : "group",
"role" : "dummy",
"roleDefinitionId" : "dummy",
"roleAssignmentId" : "dummy",
"principalId" : "dummy",
"principalType" : "dummy"
},
"scope" : "dummy",
"action" : "dummy"
},
"claims" : {
"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier" : "dummy",
"appid" : "dummy",
"http://schemas.microsoft.com/identity/claims/objectidentifier" : "dummy"
}
},
"customerID" : "dummy",
"correlationId" : "dummy",
"time" : "dummy",
"category" : "dummy",
"resultType" : "Failure",
"resultDescription": "dummy",
"durationMs" : "dummy",
"properties" : {
"eventCategory" : "Administrative",
"statusCode" : "OK"
}
}
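In other words, the kind of filter I am after would look roughly like the sketch below. The values "Unauthorized" and "Forbidden" for properties.statusCode are only an assumption on my part; that is exactly the part I have not been able to confirm from the fields listed above.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.nio.file.Files;
import java.nio.file.Paths;

public class DeniedRecordFilter {
    public static void main(String[] args) throws Exception {
        // One activity log record as streamed to the storage account (path is a placeholder).
        JsonNode record = new ObjectMapper()
                .readTree(Files.readAllBytes(Paths.get("activity-log-record.json")));

        // What I already have: failures in general.
        boolean isFailure = "Failure".equals(record.path("resultType").asText());

        // What I am missing: narrowing failures down to denied/unauthorized ones.
        // The two values below are assumed, not confirmed against our logs.
        String statusCode = record.path("properties").path("statusCode").asText();
        boolean looksDenied = "Unauthorized".equals(statusCode) || "Forbidden".equals(statusCode);

        System.out.println(isFailure && looksDenied);
    }
}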
Related
I am trying to generate a test AVRO file from a collection of objects represented by generated classes (TestAggregate.java, TestTuple.java). I used avro-tools-1.10.2.jar to generate those classes from this AVRO schema (dataset.avsc):
{
"type" : "record",
"name" : "TestAggregate",
"namespace" : "com....",
"fields" : [ {
"name" : "uuid",
"type" : "string"
}, {
"name" : "bag",
"type" : {
"type" : "array",
"items" : {
"type" : "record",
"name" : "TestTuple",
"fields" : [ {
"name" : "s",
"type" : "int"
}, {
"name" : "n",
"type" : "int"
}, {
"name" : "c",
"type" : "int"
}, {
"name" : "f",
"type" : "int"
} ]
}
},
"aliases" : [ "bag" ]
} ]
}
When I try to create an Encoder using
Encoder<TestAggregate> datasetEncoder = Encoders.bean(TestAggregate.class);, it throws an Exception:
Exception in thread "main" java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class org.apache.avro.Schema...
There is no circular reference in those generated files (or schema) as far as I can tell.
I am using Spark release 3.2.1.
Any ideas on how to resolve it?
I'm not sure you need an encoder (or the compiled class).
Take the AVSC text itself, and you can get a Schema like so:
SchemaConverters.toSqlType(new Schema.Parser().parse(avroSchema))
Then this can be given to the spark-sql from_avro function.
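A minimal sketch of that route (assuming the spark-avro package is on the classpath; both file paths are placeholders):

import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.avro.Schema;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.avro.SchemaConverters;

public class AvroWithoutBeanEncoder {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("avro-without-bean-encoder")
                .master("local[*]")
                .getOrCreate();

        // Spark SQL type derived straight from the AVSC text; no generated classes needed.
        String avroSchemaJson = new String(Files.readAllBytes(Paths.get("dataset.avsc")));
        System.out.println(SchemaConverters.toSqlType(new Schema.Parser().parse(avroSchemaJson)));

        // Read the Avro file directly into an untyped Dataset<Row>.
        // For Avro bytes sitting in a column (e.g. read from Kafka), the from_avro function
        // in org.apache.spark.sql.avro.functions accepts the same schema string instead.
        Dataset<Row> df = spark.read().format("avro").load("/tmp/test-aggregate.avro");
        df.printSchema();
        df.show(false);

        spark.stop();
    }
}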
I want to make an app where, when user A blocks user B, user A immediately can't see user B's posts or comments.
My data structure doesn't have a Users node, because I'm keeping the app simple.
"Comment" : {
"-MptDdCq-j5JAuqgHkGt" : {
"-Mq8JFEZ5gdJhYIpQ4Vm" : {
"content" : "1",
"timestamp" : 1638686327979,
"uid" : "",
"uimg" : "",
"uname" : ""
}
}
},
"Posts" : {
"-Mqsc-nqUAOWNb9_kSvl" : {
"description" : "",
"picture" : "",
"postKey" : "-Mqsc-nqUAOWNb9_kSvl",
"timeStamp" : 1639480036785,
"title" : "",
"userId" : "",
"userPhoto" : ""
},
"-MqshZSrx0aO8OPMGC2M" : {
"description" : "",
"picture" : "",
"postKey" : "-MqshZSrx0aO8OPMGC2M",
"timeStamp" : 1639481493534,
"title" : "",
"userId" : "",
"userPhoto" : ""
}
}
Here is my structure. Can I make a blocking function without a Users node?
I learned a solution from How to block users on Firebase in a social media app? for iOS,
but that solution needs a Users node. Is there no other way?
I updated my structure with a BlockUser node.
Here is the new structure:
"BlockUser" : {
"k1kn0JF5idhrMzuw46GarEIBgPw2" : "OMBueDmbXdQhePnVaVH2teyOGzl2",
"kVAREcjmrHgLlvOldJetBCoiLx93" : "kVAREXdQhePnVaVH2JetBCoiLx93"}
The left part is the user ID of the user who blocks, and the right part is the user ID of the user who has been blocked.
With this, can I make the block-user function work using Firebase rules?
My Firebase rules:
"rules": {
".read": "auth.uid != null",
".write": "auth.uid != null"}
I have data in a collection, e.g. "jobs". I am trying to copy specific data from "jobs" every 2 hours to a new collection (which may not exist initially) and also add a new key to the copied data.
I have been trying with this query to copy the data:
db.getCollection("jobs").aggregate([{ $match: { "job_name": "UploadFile", "created_datetime" : {"$gte":"2021-08-18 12:00:00"} } },{"$merge":{into: {coll : "reports"}}}])
But after this, the count in the "reports" collection is 0. Also, how can I update the copied documents (with an extra key "report_name") without using a separate updateMany() query?
The data in the jobs collection is as shown:
{
"_id" : ObjectId("60fa8e8283dc22799134dc6f"),
"job_id" : "408a5654-9a89-4c15-82b4-b0dc894b19d7",
"job_name" : "UploadFile",
"data" : {
"path" : "share://LOCALNAS/Screenshot from 2021-07-23 10-34-34.png",
"file_name" : "Screenshot from 2021-07-23 10-34-34.png",
"parent_path" : "share://LOCALNAS",
"size" : 97710,
"md5sum" : "",
"file_uid" : "c4411f10-a745-48d0-a55d-164707b7d6c2",
"version_id" : "c3dfd31a-80ba-4de0-9115-2d9b778bcf02",
"session_id" : "c4411f10-a745-48d0-a55d-164707b7d6c2",
"resource_name" : "Screenshot from 2021-07-23 10-34-34.png",
"metadata" : {
"metadata" : {
"description" : "",
"tag_ids" : [ ]
},
"category_id" : "60eed9ea33c690a0dfc89b41",
"custom_metadata" : [ ]
},
"upload_token" : "upload_token_c5043927484e",
"upload_url" : "/mnt/share_LOCALNAS",
"vfs_action_handler_id" : "91be4282a9ad5067642cdadb75278230",
"element_type" : "file"
},
"user_id" : "60f6c507d4ba6ee28aee5723",
"node_id" : "syeda",
"state" : "COMPLETED",
"priority" : 2,
"resource_name" : "Screenshot from 2021-07-23 10-34-34.png",
"group_id" : "upload_group_0babf8b7ce0b",
"status_info" : {
"progress" : 100,
"status_msg" : "Upload Completed."
},
"error_code" : "",
"error_message" : "",
"created_datetime" : ISODate("2021-07-23T15:10:18.506Z"),
"modified_datetime" : ISODate("2021-07-23T15:10:18.506Z"),
"schema_version" : "1.0.0",
}
Your $match stage contains a condition that treats created_datetime as a string, while in your sample data it is an ISODate. Such a condition won't return any documents; try:
{
$match: {
"job_name": "UploadFile",
"created_datetime": {
"$gte": ISODate("2021-07-01T12:00:00.000Z")
}
}
}
Mongo Playground
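To also add the extra "report_name" key while copying, without a separate updateMany(), you can put an $addFields stage before the $merge. A rough sketch with the Java driver (connection string, database name and the report_name value are placeholders):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Field;
import com.mongodb.client.model.Filters;
import org.bson.Document;

import java.time.Instant;
import java.util.Arrays;
import java.util.Date;

public class CopyJobsToReports {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("test");
            MongoCollection<Document> jobs = db.getCollection("jobs");

            jobs.aggregate(Arrays.asList(
                    // Same condition as above, comparing against a real date, not a string.
                    Aggregates.match(Filters.and(
                            Filters.eq("job_name", "UploadFile"),
                            Filters.gte("created_datetime",
                                    Date.from(Instant.parse("2021-07-01T12:00:00Z"))))),
                    // Attach the extra key while copying (value is a placeholder).
                    Aggregates.addFields(new Field<>("report_name", "upload_report")),
                    // Write into "reports"; the collection is created if it does not exist.
                    Aggregates.merge("reports")
            )).toCollection(); // forces the pipeline (and the $merge) to actually run
        }
    }
}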
I would like to get all artist information like rank, active listeners, and followers, and how those numbers change each day. Is there any tool that can scrape this whenever I want?
The Spotify API has a Get an Artist endpoint that gives you access to things like popularity, images, followers, genres, etc. It is documented here: https://developer.spotify.com/web-api/get-artist/
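A minimal request sketch (Java 11+ HttpClient; the artist ID is the one from the example response below, and the access token is assumed to come from one of the usual OAuth flows):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GetArtist {
    public static void main(String[] args) throws Exception {
        String accessToken = System.getenv("SPOTIFY_ACCESS_TOKEN"); // assumed to be set

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.spotify.com/v1/artists/0OdUWJ0sBjDrqHygGUXeCF"))
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.body()); // JSON like the example below
    }
}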
A response may look like this:
{
"external_urls" : {
"spotify" : "https://open.spotify.com/artist/0OdUWJ0sBjDrqHygGUXeCF"
},
"followers" : {
"href" : null,
"total" : 306565
},
"genres" : [ "indie folk", "indie pop" ],
"href" : "https://api.spotify.com/v1/artists/0OdUWJ0sBjDrqHygGUXeCF",
"id" : "0OdUWJ0sBjDrqHygGUXeCF",
"images" : [ {
"height" : 816,
"url" : "https://i.scdn.co/image/eb266625dab075341e8c4378a177a27370f91903",
"width" : 1000
}, {
"height" : 522,
"url" : "https://i.scdn.co/image/2f91c3cace3c5a6a48f3d0e2fd21364d4911b332",
"width" : 640
}, {
"height" : 163,
"url" : "https://i.scdn.co/image/2efc93d7ee88435116093274980f04ebceb7b527",
"width" : 200
}, {
"height" : 52,
"url" : "https://i.scdn.co/image/4f25297750dfa4051195c36809a9049f6b841a23",
"width" : 64
} ],
"name" : "Band of Horses",
"popularity" : 59,
"type" : "artist",
"uri" : "spotify:artist:0OdUWJ0sBjDrqHygGUXeCF"
}
You can request additional features in the Web API issue tracker here: https://github.com/spotify/web-api/issues
I have documents that contain an object whose attributes are editable (add/delete/edit) at runtime.
{
"testIndex" : {
"mappings" : {
"documentTest" : {
"properties" : {
"typeTestId" : {
"type" : "string",
"index" : "not_analyzed"
},
"createdDate" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"designation" : {
"type" : "string",
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed"
}
}
},
"id" : {
"type" : "string",
"index" : "not_analyzed"
},
"modifiedDate" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"stuff" : {
"type" : "string"
},
"suggest" : {
"type" : "completion",
"analyzer" : "simple",
"payloads" : true,
"preserve_separators" : true,
"preserve_position_increments" : true,
"max_input_length" : 50,
"context" : {
"typeTestId" : {
"type" : "category",
"path" : "typeTestId",
"default" : [ ]
}
}
},
"values" : {
"properties" : {
"Att1" : {
"type" : "string"
},
"att2" : {
"type" : "string"
},
"att400" : {
"type" : "date",
"format" : "dateOptionalTime"
}
}
}
}
}
}
}
}
The field values is an object that can be edited through typeTest, so if I change something in typeTest it should be reflected here. If I create a new field there's no problem, but it should also be possible to edit or delete existing fields in typeTest. For example, if I delete values.att1, all documentTest documents should lose that field, and the mapping should be updated as well.
From what I saw, we cannot do this without reindexing. So for now my solution is to remove the fields in Elasticsearch, just as mentioned in this question, and have a worker do the reindexing from time to time if needed.
This does not seem like a real "solution" to me. Is there a better way to have documents of this type in Elasticsearch, with this flexibility, without having to reindex from time to time?
You can use the Update API to delete, add or modify a field.
The thing is, docs are immutable in Elasticsearch, so when you make changes with the Update API it works by marking the old document as deleted and indexing a new one with the updates.
The deletion and creation of the new documents is transparent to you, so you do not have to reindex or do anything else. The downside is that if you are planning to modify very large numbers of documents (like an update query that modifies 5 million documents), it will be very I/O intensive for the nodes.
By the way, this also applies to deletions.
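For example, removing one of those runtime attributes from a single document could look roughly like this (a plain HTTP sketch against the Update API; the host, document id and exact script syntax are placeholders that depend on your Elasticsearch version, and the mapping entry itself still only goes away with a reindex):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RemoveRuntimeAttribute {
    public static void main(String[] args) throws Exception {
        // Scripted update: the old document is marked as deleted and a new one is indexed
        // without values.Att1 (scripting must be enabled/allowed for your version).
        String body = "{ \"script\": \"ctx._source.values.remove('Att1')\" }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/testIndex/documentTest/SOME_DOC_ID/_update"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode() + " " + response.body());
    }
}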