I'm working with a CSV export of identity data containing ~22000 records. One of the fields included is titled 'ExtendedAttributes' and each cell in the column contains a quote bound string of comma separated Key:Value pairs. Each record in the file has an arbitrary number of extended attributes (up to around 50). My ultimate objective is to expand these extended attributes into their own columns in Excel (2016). I already have solutions for the expansions into columns from other data using formulae, simple VBA and most recently Power Query based approaches.
However, my previous solutions have all been based on the Key:Value pairs being simple to delimit. In this export, the ExtendedAttributes field has:
Value data that may contain unescaped/unquoted commas. e.g.
"Key1: Value1, name: surname, forename, Key2: Value2, ... "
Keys that may contain multiple comma separated values, which are also unquoted/unescaped. e.g.
"Key1: Value1, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, ... "
My usual approach, where the Key:Value pairs don't have these problems, would be to split on commas to get the individual key/value pairs, transpose the data into rows, then split on the colon to populate my new columns and their values, as described here in the PowerBI community pages.
This doesn't work here because delimiting using a comma breaks the values.
Is there a straightforward way to parse this into the constituent Key:Value pairs using (ideally) Power Query? Happy to also go with VBA or formula based solutions.
My instinctive approach would be to try to identify substrings containing a colon and prepend them with a unique character, which can then be used as a delimiter. (It's not impossible that the data may also include unescaped colons, but I'm happy to assume that it doesn't.) However, I recognise that this may be a needlessly complex approach, and I'm unsure how best to do it.
I'm happy to keep values with multiple comma separated items as a single unit (A problem for me to deal with later).
For the example data:
"Key1: Value1, name: surname, forename, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, ... "
I'd like to end up with something that lets me treat the data like this, using maybe a ! as an example unique character that I could then use as a delimiter:
"Key1: Value1!name: surname, forename!emailAlias: alias1#domain, alias2#domain, alias3#domain!Key2: Value2!..."
I don't have access to the original data (vendor controlled system) and have limited data processing tools on my corporate desktop (Excel 2016, VBA, PQ).
Appreciate any help.
In Power Query, you can define a function Partition as follows:
let
    Output = (str as text, sep as text) as text =>
        Text.RemoveRange(
            Text.Replace(
                Text.Combine(
                    List.Transform(
                        Text.Split(str, " "),
                        each if Text.Contains(_, ":") then sep & _ else _
                    ),
                    " "
                ), ", " & sep, sep
            ),
            0, Text.Length(sep)
        )
in
    Output
Example text transformation using separator !
Starting text:
Key1: Value1, name: surname, forename, emailAlias: alias1#domain
Split the string based on spaces into a list
Key1:
Value1,
name:
surname,
forename,
emailAlias:
alias1#domain
Prepend any list items containing : with separator !
!Key1:
Value1,
!name:
surname,
forename,
!emailAlias:
alias1#domain
Combine the list back into a string
!Key1: Value1, !name: surname, forename, !emailAlias: alias1#domain
Replace , ! with !
!Key1: Value1!name: surname, forename!emailAlias: alias1#domain
Remove the first separator
Key1: Value1!name: surname, forename!emailAlias: alias1#domain
Once you have this function defined, you can call it in a column transformation that would look something like
= Table.TransformColumns(#"Prev Step", {{"ColName", each Partition(_,"!") , type text}})
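For readers who want to check the logic outside Power Query, the same five steps can be sketched in Python. This is only an illustration under the same assumptions as the M function (tokens are separated by single spaces, and only keys contain a colon); the names partition and to_pairs are made up for this sketch.

```python
def partition(text: str, sep: str) -> str:
    """Mark the start of each 'Key:' token with sep, mirroring the M function."""
    tokens = text.split(" ")                                   # split on spaces
    marked = [sep + t if ":" in t else t for t in tokens]      # prepend sep to keys
    joined = " ".join(marked)                                  # recombine
    joined = joined.replace(", " + sep, sep)                   # ", !" becomes "!"
    return joined[len(sep):]                                   # drop the leading sep


def to_pairs(text: str, sep: str = "!") -> dict:
    """Hypothetical helper: split the partitioned string into a key/value dict."""
    pairs = {}
    for chunk in partition(text, sep).split(sep):
        key, _, value = chunk.partition(":")
        pairs[key.strip()] = value.strip()
    return pairs


sample = "Key1: Value1, name: surname, forename, emailAlias: alias1#domain"
print(partition(sample, "!"))
print(to_pairs(sample))
```

Values that contain several comma separated items stay together as a single string, which matches the requirement in the question.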
You do not say anything about the parts of each record that lie outside the "ExtendedAttributes" field, so I prepared a function that separates only the field under discussion. Please use the following code:
Function separateKeys(x As String, sep As String) As String
    Dim arr1, arr2, i As Long
    'every element after the first has the form "<value(s)>, <next key>"
    arr1 = Split(x, ": ")
    'the last element holds only the final value(s), so it is left untouched
    For i = 1 To UBound(arr1) - 1
        arr2 = Split(arr1(i), " ")
        If UBound(arr2) > 0 Then
            'the token just before the next key ends with the comma closing the pair
            arr2(UBound(arr2) - 1) = Replace(arr2(UBound(arr2) - 1), ",", sep)
            arr1(i) = Join(arr2, " ")
        End If
    Next
    separateKeys = Replace(Join(arr1, ":"), sep & " ", sep)
End Function
The above function can (probably) be adapted to skip the rest of each record, or to turn every comma outside the field into the sep character as well (simply using Replace).
In order to test the above function, please use the next testing Sub:
Sub testSepKeys()
Dim x As String, sep As String
sep = "|" 'any other character works, as long as it is unlikely to appear in the processed text
x = "Key1: Value1, name: surname, forename, emailAlias: alias1#domain, alias2#domain, alias3#domain, Key2: Value2, Value3, key3: Val1, Val2"
Debug.Print separateKeys(x, sep)
End Sub
As an overall approach, I would suggest splitting the file on the line separator, processing all the array elements (lines) with the above (adapted) function, and finally joining them back on the line separator.
The newly created file can then be opened using Workbooks.OpenText with DataType:=xlDelimited, Other:=True, OtherChar:=sep.
Please, test the above function and send some feedback.
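As a cross-check of the VBA logic and of the line-by-line workflow suggested above, here is a minimal Python sketch. It assumes that each line consists only of the ExtendedAttributes text and that keys never contain spaces; separate_keys and process_lines are names invented for this illustration.

```python
def separate_keys(text: str, sep: str) -> str:
    """Replace the comma that closes each key/value pair with sep."""
    parts = text.split(": ")
    # Elements 1..n-1 look like "<value(s)>, <next key>"; the last one is values only.
    for i in range(1, len(parts) - 1):
        tokens = parts[i].split(" ")
        if len(tokens) > 1:
            # The token before the next key carries the closing comma.
            tokens[-2] = tokens[-2].replace(",", sep)
            parts[i] = " ".join(tokens)
    return ":".join(parts).replace(sep + " ", sep)


def process_lines(content: str, sep: str = "|") -> str:
    """Split on line breaks, transform each line, and join the lines again."""
    return "\n".join(separate_keys(line, sep) for line in content.split("\n"))


x = ("Key1: Value1, name: surname, forename, emailAlias: alias1#domain, "
     "alias2#domain, alias3#domain, Key2: Value2, Value3, key3: Val1, Val2")
print(separate_keys(x, "|"))
```

Like the VBA version, this drops the space after each colon, so a later split on the colon yields values without leading blanks.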
I want to compare the values of two textboxes with data in a datatable and use this comparison to filter the datatable.
For example: I want to show all data (rows and columns) that have a value x for which:
textbox1.text > x > textbox2.text
I have used the "Like" operator inside string.Format to get the rows that match the text-box values, but I could not work out how to do the required range filtering.
Here is my code related to the specified question:
dv.RowFilter = string.Format("Type Like '%{0}%' and Gain Like" +
"'%{1}%'" +
"and Year Like'%{2}%' and MotorPower Like '%{3}%'" +
"and Profit Like '%{4}%'", textBoxType .Text,textBoxGain.Text
, textBoxYear.Text, textBoxBiggerthan.Text, textBoxKar.Text);
dataGridView1.DataSource = dv;
I have another input textbox called textBoxSmallerthan.Text
and I want to make my range for MotorPower column in datatable (datagridview) between textBoxBiggerthan.Text and textBoxSmallerthan.Text
The documentation here shows that numbers do not need to be wrapped in single quotation marks. So the format is:
Columnname < Number
So the final filter should be something like this:
dv.RowFilter = string.Format("Type Like '%{0}%' and Gain Like " +
"'%{1}%' " +
"and Year Like '%{2}%' and MotorPower > {3} and MotorPower < {4} " +
"and Profit Like '%{5}%'", textBoxType.Text, textBoxGain.Text,
textBoxYear.Text, textBoxBiggerthan.Text, textBoxSmallerthan.Text, textBoxKar.Text);
dataGridView1.DataSource = dv;
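To make the structure of the filter expression easier to see, here is a small language-neutral sketch in Python that assembles the same kind of string. It is only an illustration: build_row_filter is a hypothetical helper, the doubling of single quotes is an assumption about how text should be escaped inside the expression, and the two bounds are parsed as numbers so that they are emitted without quotes.

```python
def build_row_filter(type_text, gain_text, year_text, lower, upper, profit_text):
    """Assemble a filter string with LIKE clauses and an unquoted numeric range."""

    def like(column, value):
        # Assumption: a single quote inside a text literal is escaped by doubling it.
        escaped = value.replace("'", "''")
        return f"{column} LIKE '%{escaped}%'"

    # float() raises ValueError for non-numeric input, so a bad bound is caught early.
    low, high = float(lower), float(upper)
    clauses = [
        like("Type", type_text),
        like("Gain", gain_text),
        like("Year", year_text),
        f"MotorPower > {low}",     # numbers are not wrapped in quotes
        f"MotorPower < {high}",
        like("Profit", profit_text),
    ]
    return " AND ".join(clauses)


print(build_row_filter("A", "", "2020", "100", "250", ""))
```

Building the clauses as a list and joining them with " AND " avoids the missing-space problems that creep in when the expression is concatenated by hand.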
I need to export all my users, together with the data they submitted through webforms, to an Excel file. I can export the users, but I don't know how to include the related webform data. Please help me.
How about this:
select CONCAT(GROUP_CONCAT(CONCAT('"', sd.data, '"' )), ', "',u.uid, '","', u.name, '", "', u.mail,'"')
from webform_submitted_data sd
JOIN webform_submissions s ON s.sid = sd.sid
JOIN users u ON u.uid = s.uid
GROUP by s.sid
LIMIT 1
INTO OUTFILE '/Users/nandersen/Downloads/users.csv'
FIELDS TERMINATED BY '' ENCLOSED BY '' LINES TERMINATED BY '\n';
You need to group-concatenate the webform data, which is stored in multiple rows but one column, with the user data, which is stored in multiple columns but one row. So by concatenating the data separately and grouping by the submission id, you can get the data you need.
Will output something like this:
"Nate","Andersen","nate#test.com","123 Atlanta Avenue","Nederland","Texas","12345","4095496504","safe_key4","safe_key6","safe_key2","09/07/1989", "69","oknate", "nate#test.com"
Somehow my corporate email address has found its way onto a spam/phish list. I suppose it's unavoidable, but I can't think of any time that I've sent an email to an external address and I'm very curious to know how it could have 'escaped'.
I would like to create a SELECT formula to find any mails where one or more recipients are external (i.e. do not end with '@mycompany.com', '@mycompany.com>' or '/MYCOMPANY/COM').
I've used @Contains in other queries, but @Contains and @Ends don't really do the job here. If they returned a count of the number of matches, then I could compare it to the total number of recipients. Any mails where these totals are unequal would be the ones I'm looking for. But they only return booleans.
I would do it like this (do NOT mix MYDOMINODOMAIN with /MYCOMPANY/COM):
_myDomains := @LowerCase("MYDOMINODOMAIN" : "mycompany.com" : "mycompany.net");
_mailRecipientString := @LowerCase(@ReplaceSubstring(SendTo : CopyTo : BlindCopyTo : Recipients; @Char(13) : @Char(9) : @Char(34) : @Char(39) : "," : "<" : ">" : "\"" : " " ; " "));
_mailRecipientValues := @Explode(@Implode(_mailRecipientString;" "); " "; @False);
_mailDomains := @Unique(@Trim(@Explode(@Implode(@Word(_mailRecipientValues; "@"; 2); " "); " "; @False)));
SELECT @Trim( @Replace( _mailDomains ; _myDomains ; "" ) ) != ""
What does this formula do?
Every address in SendTo, CopyTo, BlindCopyTo and Recipients (the last one is included out of paranoia, as it contains the three former) ALWAYS has an @. I get the domains of these addresses using @Word.
Then I replace the "good" domains in this list with an empty string (using @LowerCase to be sure the comparison is case-insensitive). If the result is anything other than the empty string, at least one external recipient was found.
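The same idea, collecting the domain part of every recipient and removing the known internal domains, can be sketched in Python to make the logic easier to follow. This is not Notes code, just an illustration; the function name external_domains, the INTERNAL set, and the assumption that the domain follows an at sign (@) in each address are all choices made for this sketch.

```python
import re

# Assumption: these are the domains considered internal.
INTERNAL = {"mydominodomain", "mycompany.com", "mycompany.net"}


def external_domains(recipients, internal=INTERNAL):
    """Return the set of recipient domains that are not in the internal list."""
    domains = set()
    for entry in recipients:
        # Take whatever follows an at sign, up to the first character that
        # cannot be part of a domain (space, comma, '>', quote, ...).
        for match in re.findall(r"@([A-Za-z0-9.-]+)", entry):
            domains.add(match.lower().rstrip("."))
    return domains - {d.lower() for d in internal}


def has_external_recipient(recipients):
    """True when at least one recipient domain is outside the internal list."""
    return bool(external_domains(recipients))


mail = ["John Doe <john@mycompany.com>", "eve@Example.org", "Jane/MYCOMPANY/COM@MYDOMINODOMAIN"]
print(external_domains(mail))
```

A set difference plays the role of the replace-with-empty-string step: whatever is left after removing the internal domains is external.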
Given the string below, with "server type" values separated by commas:
string serverTypeList = "DB, IIS, CMDB";
//server.Type in the loop below should have value of "MDB"
My problem is that in this scenario it will return TRUE, because the string "MDB" is contained in serverTypeList (as part of "CMDB").
I need it to return TRUE only if the list contains an entry that is exactly "MDB", not "CMDB":
...
from site in SiteManager.Sites
from server in site.Servers
where
serverTypeList.Contains(server.Type)
select new Server()
{ ID=server.ID, SiteName=site.Name }
...
How can I change the code above?
Thank you
(", " + serverTypeList + ", ").Contains(", " + server.Type + ", ")
is one standard way to handle this. I'm not clear on the language you're using, so I don't know the exact syntax you would need, but the general idea is to ensure that the term appears between delimiters by forcing delimiters before and after the list string.
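A minimal Python sketch of the delimiter-wrapping idea, for illustration only. The function names are invented here, and the delimiter is assumed to be a comma followed by a single space, exactly as in the sample list.

```python
def contains_type(server_type_list: str, server_type: str, delim: str = ", ") -> bool:
    """Whole-entry match: wrap both the list and the term in delimiters."""
    return (delim + server_type + delim) in (delim + server_type_list + delim)


def contains_type_split(server_type_list: str, server_type: str) -> bool:
    """Alternative: split the list into entries and compare them exactly."""
    entries = [entry.strip() for entry in server_type_list.split(",")]
    return server_type in entries


server_type_list = "DB, IIS, CMDB"
print(contains_type(server_type_list, "MDB"))    # False: "MDB" is only part of "CMDB"
print(contains_type(server_type_list, "CMDB"))   # True
```

The split-based variant is more tolerant of inconsistent spacing around the commas, at the cost of building a list for every check.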