Extract specific data from field in Access table - string

Every 2 weeks I need to import an Excel file into an Access 2007 database. The second cell in the Excel file (A2) contains different information each time, but it always starts with AS OF PAY PERIOD XX, where XX stands for the pay period. Once the file is imported into an Access table, I need to extract the pay period. It seems that the pay period always starts at position 18 and is always 2 characters long. Is there an easy way to extract that information with a string function? Thanks.

http://office.microsoft.com/en-us/access-help/mid-function-HA001228881.aspx
Returns a Variant (String) containing a specified number of characters from a string.
Syntax
Mid(string, start [, length ] )
The Mid function syntax has these arguments :
string - Required. string expression from which characters are returned. If string contains Null, Null is returned.
start - Required. Long - Character position in string at which the part to be taken begins. If start is greater than the number of characters in string, Mid returns a zero-length string ("").
length - Optional. Variant (Long) - Number of characters to return. If omitted or if there are fewer than length characters in the text (including the character at start), all characters from the start position to the end of the string are returned.
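Use the Mid function within a query, a SQL statement, or on a field value while processing a recordset. For the pay period described in the question, the expression would be Mid([YourField], 18, 2), where YourField is a placeholder for the name of the imported field.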
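To check the position arithmetic outside Access, here is a minimal Python sketch of the same extraction. The helper name mid and the sample cell text are hypothetical; the function only mimics the 1-based start and optional length described above.

```python
def mid(text, start, length=None):
    """Mimic Mid(string, start[, length]) with a 1-based start position."""
    if text is None:           # Mid returns Null when the string is Null
        return None
    begin = start - 1          # convert to Python's 0-based index
    if length is None:         # no length: take everything from start onwards
        return text[begin:]
    return text[begin:begin + length]

# Hypothetical cell content following the "AS OF PAY PERIOD XX" pattern
cell = "AS OF PAY PERIOD 07 ENDING 03/28"
pay_period = mid(cell, 18, 2)
print(pay_period)  # 07
```

"AS OF PAY PERIOD " is 17 characters long, which is why the pay period starts at position 18.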

Format in Python

I have a list of values as follows:
no column
1. 111-222-11
2. 112-333-12
3. 113-444-13
I want to format the value from 111-222-11 to 111-222-011 and format the other values similarly. Here is my code snippet in Python 3, which I am trying to use for that:
'{:03}-{:06}-{:03}'.format(column)
I hope that you can help.
Assuming that column is a variable that can be assigned string values 111-222-11, 112-333-12, 113-444-13 and so on, which you want to change to 111-222-011, 112-333-012, 113-444-013 and so on, it appears that you tried to use a combination of slice notation and format method to achieve this.
Slice notation
Slice notation, when applied to a string, treats it as a list-like object consisting of characters. The positional index of a character from the beginning of the string starts from zero. The positional index of a character from the end of the string starts with -1. The first colon : separates the beginning and the end of a slice. The end of the slice is not included into it, unlike its beginning. You indicate slices as you would indicate indexes of items in a list by using square brackets:
'111-222-11'[0:8]
would return
'111-222-'
When a slice starts at the first character or runs through the last one, the corresponding index is usually omitted and implied by the colon.
Knowing the exact position where you need to add a leading zero before the last two digits of a string assigned to column, you could do it just with slice notation:
column[:8] + '0' + column[-2:]
format method
The format method is a string formatting method. It is called on a string, so the template needs single or double quotes around it:
'your output string here'.format('your input string here')
The numbers in the curly brackets are not slices. They are placeholders, where the strings, which are passed to the format method, are inserted. So, combining slices and format method, you could add a leading zero before the last two digits of a column string like this:
'{0}0{1}'.format(column[:8], column[-2:])
Making more slices is not necessary because there is only one place where you want to insert a character.
split method
An alternative to slicing is the split method, which splits the string by a delimiter and returns a list of strings. Prefix the list with the * operator to unpack it into separate arguments before passing them to the format method. Otherwise, the whole list is passed to the first placeholder.
'{0}-{1}-0{2}'.format(*column.split('-'))
This splits the string into a list using - as the separator, then inserts each item into a new string that puts a 0 character before the last item.
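As a quick check, the three approaches above can be run side by side. The last variant, which uses rpartition and zfill, is not from the answer; it is an assumption added here for cases where the last group should be padded to a width rather than always given one extra 0.

```python
column = '111-222-11'

# The three approaches from the answer
by_slice = column[:8] + '0' + column[-2:]
by_format = '{0}0{1}'.format(column[:8], column[-2:])
by_split = '{0}-{1}-0{2}'.format(*column.split('-'))

# Added variant (assumption): pad the last group to 3 digits with zfill
head, sep, tail = column.rpartition('-')
by_zfill = head + sep + tail.zfill(3)

for result in (by_slice, by_format, by_split, by_zfill):
    print(result)  # 111-222-011 each time
```

The zfill variant leaves a last group that is already 3 digits long unchanged, whereas the other three always insert a 0.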

PowerQuery M Conditional Column 'count' argument is out of range

I have a sheet with dates as MMDDYYYY with no leading 0 if the month number is a single digit. For example, 1012018 or 12312018. Each record has a date, and each date is either 7 or 8 characters in length.
Here is the code I am using to convert the numbers to dates:
if Text.Length([ContractDate]) = 7
then
Text.Range([ContractDate],0,1)&"/"&Text.Range([ContractDate],1,2)&"/"&Text.Range([ContractDate],4,4)
else
Text.Range([ContractDate],0,2)&"/"&Text.Range([ContractDate],2,2)&"/"&Text.Range([ContractDate],4,4)
The code works fine for the "else" condition but I am getting error "Expression.Error: The 'count' argument is out of range. Details: 4" for all records where Text.Length() = 7. I verified this by adding a second column to get Length of ContractDate.
What am I missing?
EDIT: Problem Solved - I'm an idiot. I was getting an error because in the "then" condition, I am extracting a substring of (4,4) from a value that only has Len=7. I can't get 4 characters out of a 7 character string when starting at index of 4.
I know you found the issue with your code, but it's worth pointing out some things that might be good to know.
Text.Range with no character count returns all characters past the start point, so Text.Range([ContractDate], 4) would not raise the error for either length. Note, though, that in a 7-character value the year starts at index 3, not 4.
Text.Middle operates like Text.Range but will not cause an error if you select a range that extends past the end of the string. This can be useful when dealing with variable-length strings where you need up to a certain number of characters past a given position.
You could also use Text.PadStart([ContractDate], 8, "0") to pad the 7-character strings with a 0 at the start, and avoid the need for a conditional check altogether.
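The pad-first idea is easy to try outside Power Query. Here is a minimal Python sketch of it, where the function name to_date_text is hypothetical and rjust stands in for Text.PadStart:

```python
def to_date_text(raw):
    """Turn 'MDDYYYY' or 'MMDDYYYY' text into 'MM/DD/YYYY'."""
    padded = raw.rjust(8, "0")   # left-pad to 8 characters, like Text.PadStart
    month, day, year = padded[0:2], padded[2:4], padded[4:8]
    return "/".join([month, day, year])

for value in ["1012018", "12312018"]:
    print(value, "->", to_date_text(value))
```

Once every value is 8 characters long, the same three fixed ranges work for all rows, so no length check is needed.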

Excel : Find only Hexa decimals from 1 cell

I'm a newbie on Excel.
I have a list of names, some of which end with hexadecimal numbers and some of which don't.
My mission is to see only those names that end with hexadecimal numbers (maybe by filtering them somehow).
Column:
BFAXSPOINTDEVBAUHOFLAN2AD
BFAXSQLBAUHOFLAN207
BFAXSQLDEVBAUHOFLAN27A
BFREPDEVBAUHOFLAN258
BFREPORTINGBAUHOFLAN20B
COBALTSEA02900
COBALTSEAVHOST900
DIRECTO8000
DIRECTO9000
DIRECTODCDIRECTOLA009
DYNAMAEBSSISE006
SURVEYEBSSISE006
KVMSRV00",
KVMSRV01",
KVMSRV02",
ASR
CACTI
DBSYNC",
DTV
and so on...
The function HEX2DEC will help you achieve what you want: it attempts to convert a hexadecimal number into a decimal. If the input is not valid hex, it produces an error.
The key is knowing how many digits you expect your hexadecimal number to have: is it the last 5 characters, the last 10, etc.? Also note that there is a risk that random text or numbers will be read as hexadecimal when that's not what they represent [but that's a problem with the question as you have laid it out; going solely on the text provided, all we can check is whether a particular cell yields a valid hexadecimal value].
The full formula would look like this [assuming your data starts in A1 and your hexadecimal numbers are expected to be 6 characters long; this goes in B1 and is copied down]:
=ISERROR(HEX2DEC(RIGHT(A1,6)))
This takes the 6 rightmost characters of a cell, and attempts to convert it from Hex to Decimal. If it fails, it will produce TRUE [because of ISERROR]; if it succeeds, it will produce FALSE.
Then simply filter column B on FALSE to see the names that end in a valid hex value.
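Consider the following UDF:
Public Function EndsInHex(r As Range) As Boolean
    Dim s As String, CH As String
    ' Use the displayed text of the first cell in the range
    s = r(1).Text
    ' Look only at the last character
    CH = Right(s, 1)
    ' True when it is a hex digit (0-9 or uppercase A-F)
    If CH Like "[A-F]" Or CH Like "[0-9]" Then
        EndsInHex = True
    Else
        EndsInHex = False
    End If
End Function
For the string to end in a hex number, the last character must be a hex digit.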
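Both answers depend on how many trailing characters are checked. The Python sketch below makes that explicit; the function name tail_is_hex is hypothetical, and accepting lowercase a-f is an assumption. It tests each character directly instead of attempting a conversion, so it is an analogue of the two approaches rather than a port of either.

```python
import string

def tail_is_hex(name, width):
    """True when the last `width` characters of name are all hex digits."""
    tail = name[-width:]   # like RIGHT(A1, width); shorter names are taken whole
    return tail != "" and all(ch in string.hexdigits for ch in tail)

names = ["BFAXSPOINTDEVBAUHOFLAN2AD", "BFAXSQLDEVBAUHOFLAN27A",
         "DIRECTO8000", "ASR", "CACTI", "DBSYNC"]

# Checking the last 3 characters keeps the first three names
print([n for n in names if tail_is_hex(n, 3)])

# Checking only the last character also lets DBSYNC through, because C is a hex digit
print([n for n in names if tail_is_hex(n, 1)])
```

The second result shows the risk mentioned above: with too few characters checked, ordinary names that happen to end in A-F are reported as hex.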

I want to extract only last two numeric values from a string variable in SAS

I want to extract only the last two numeric values from a string variable and assign them to a new variable. First I extracted all the numeric values from the string using the code below and assigned them to a new variable, but ultimately I want only the last two. Is there a better way to do this?
UI_DUM = input(compress(Prod_Desc,,"kd"),best.);
One more question: how do I assign a temp variable for doing some manipulation work in SAS?
Here is the code, with an explanation first.
You are doing it right by removing the characters and keeping only the digits. The same is done for the variable "temp1" in the code below.
The second step uses the LENGTH function to calculate the total length of the string, which now contains only digits. The third step uses the SUBSTR function to extract the last two digits.
If you want to do it in one statement, the "final" variable is the answer.
LENGTH function - Returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a blank character string.
COMPRESS function with the "kd" modifiers - keeps only digits.
COMPRESS(source<, chars><, modifiers>)
Modifier - specifies a character constant, variable, or expression in which each non-blank character modifies the action of the COMPRESS function. Blanks are ignored. The following characters can be used as modifiers:
d or D adds digits to the list of characters.
k or K keeps the characters in the list instead of removing them.
SUBSTR function - Extracts a substring from an argument.
SUBSTR(string, position<,length>)
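data _null_;
    Test_string="ada13117a1w11da1286s";
    temp1=compress(Test_string, , 'kd');   /* keep only the digits */
    temp2=length(temp1);                   /* number of digits */
    temp3=substr(temp1,temp2-1,2);         /* last two digits */
    /* same result in a single statement, without using temp1 */
    final=substr(compress(Test_string, , 'kd'),length(compress(Test_string, , 'kd'))-1,2);
    put _all_;
run;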
Regarding the temp variable, SAS has no special kind of temporary variable. Just use any variable name and drop it from the final dataset with the drop option, like below:
data test(drop = temp); /* temp works as the temporary variable */
    temp = 2*balance; /* just for example */
    /* use temp in further calculations */
run;
A somewhat different take:
data want;
set have;
UI_DUM = input(compress(Prod_Desc,,"kd"),best.);
UI_DUM_last2 = mod(UI_DUM,100);
run;
You could do that all in one line of course as well. This uses the numeric modulo function to give you the last 2 digits (any non-negative integer modulo 100 returns its final 2 digits). Because the result is numeric, a final pair such as 05 comes back as 5.
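The difference between the substring approach and the modulo approach shows up when the second-to-last digit is 0. Here is a small Python sketch of both, using the test string from the answer above; the function names are hypothetical.

```python
DIGITS = "0123456789"

def last_two_as_text(text):
    """Keep only the digits, then take the last two characters (substring approach)."""
    digits = "".join(ch for ch in text if ch in DIGITS)
    return digits[-2:]

def last_two_as_number(text):
    """Keep only the digits, convert to a number, then take it modulo 100."""
    digits = "".join(ch for ch in text if ch in DIGITS)
    return int(digits) % 100

print(last_two_as_text("ada13117a1w11da1286s"))    # 86
print(last_two_as_number("ada13117a1w11da1286s"))  # 86
print(last_two_as_text("item105"))                 # 05
print(last_two_as_number("item105"))               # 5
```

Which one fits depends on whether the new variable should be character or numeric.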

Excel 2007 - Generate unique ID based on text?

I have a sheet with a list of names in Column B and an ID column in A. I was wondering if there is some kind of formula that can take the value in column B of that row and generate some kind of ID based on the text? Each name is also unique and is never repeated in any way.
It would be best if I didn't have to use VBA really. But if I have to, so be it.
Solution Without VBA.
Logic based on First 8 characters + number of character in a cell.
= CODE(cell) which returns Code number for first letter
= CODE(MID(cell,2,1)) returns Code number for second letter
= IFERROR(CODE(MID(cell,9,1)) If 9th character does not exist then return 0
= LEN(cell) number of character in a cell
Concatenating firs 8 codes + adding length of character on the end
If 8 character is not enough, then replicate additional codes for next characters in a string.
Final function:
=CODE(B2)&IFERROR(CODE(MID(B2,2,1)),0)&IFERROR(CODE(MID(B2,3,1)),0)&IFERROR(CODE(MID(B2,4,1)),0)&IFERROR(CODE(MID(B2,5,1)),0)&IFERROR(CODE(MID(B2,6,1)),0)&IFERROR(CODE(MID(B2,7,1)),0)&IFERROR(CODE(MID(B2,8,1)),0)&LEN(B2)
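To see what kind of ID this formula produces, and where it stops being unique, here is a rough Python equivalent. The function name code_id and the sample names are hypothetical, and ord stands in for CODE, which matches for plain ASCII text.

```python
def code_id(name):
    """Concatenate the character codes of the first 8 characters, then the length."""
    codes = [str(ord(ch)) for ch in name[:8]]
    codes += ["0"] * (8 - len(codes))   # missing characters count as 0, like IFERROR(..., 0)
    # Note: for an empty name the Excel formula errors on CODE(B2); this sketch does not
    return "".join(codes) + str(len(name))

print(code_id("Ann"))
# Same first 8 characters and same length give the same ID
print(code_id("Jonathan Smith") == code_id("Jonathan Smyth"))
```

So the IDs are only unique as long as no two names share both their first 8 characters and their length, which is the case the "replicate additional codes" advice is meant to cover.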
Sorry, I didn't find a formula-only solution, even if this thread might help (trying to calculate the points in a Scrabble game), but I couldn't find a way to be sure the generated hash would be unique.
Yet, here is my solution, based on a UDF (User-Defined Function):
Put the code in a module:
Public Function genId(ByVal sName As String) As Long
    'Hash a string by summing the ASCII value of each character times its position
    '(different strings can still produce the same sum)
    Dim i As Integer
    For i = 1 To Len(sName)
        genId = Asc(Mid(sName, i, 1)) * i + genId
    Next i
End Function
And call it in your worksheet like a formula:
=genId(B2)
[EDIT] Added the * i to take into account the order. It works on my unit tests
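A position-weighted sum like this depends on character order, but it is still not guaranteed to be unique. Here is a short Python sketch of the same arithmetic (gen_id is a hypothetical port of the UDF above, with ord standing in for Asc) that shows a collision:

```python
def gen_id(name):
    """Sum of each character's code multiplied by its 1-based position."""
    return sum(ord(ch) * pos for pos, ch in enumerate(name, start=1))

print(gen_id("ab"))  # 97*1 + 98*2 = 293
print(gen_id("ca"))  # 99*1 + 97*2 = 293
print(gen_id("ba"))  # 98*1 + 97*2 = 292, so order does matter
```

For a small list of names collisions may never occur, but it is worth checking the generated column for duplicates before relying on it.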
May be OTT for your needs, but you can use a call to CoCreateGuid to get a real GUID
Private Declare Function CoCreateGuid Lib "ole32" (ID As Any) As Long
Function GUID() As String
    Dim ID(0 To 15) As Byte
    Dim i As Long
    If CoCreateGuid(ID(0)) = 0 Then
        For i = 0 To 15
            'Append each byte as exactly two hex digits
            GUID = GUID & Right$("0" & Hex$(ID(i)), 2)
        Next
    Else
        GUID = "Error while creating GUID!"
    End If
End Function
Test using
Sub testGUID()
MsgBox GUID
End Sub
How best to implement this depends on your needs. One way would be to write a macro that gets a GUID for each row and populates a column where names exist. (Note: using it as a UDF as-is is no good, since it will return a new GUID each time it is recalculated.)
EDIT
See this answer for creating a SHA1 hash of a string
Do you just want an incrementing numeric id column to sit next to your values? If so, and if your values will always be unique, you can very easily do this with formulae.
If your values were in column B, starting in B2 underneath your headers for example, in A2 you would type the formula "=IF(B2="","",1+MAX(A$1:A1))". You can copy and paste that down as far as your data extends, and it will increment a numeric identifier for each row in column B which isn't blank.
If you need to do anything more complicated, like identify and re-identify repeating values, or make identifiers 'freeze' once they're populated, let me know. Currently, when you clear or add values in your list, the identifiers will shift up and down, so you need to be careful if your data changes.
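The behaviour of that running-maximum formula, including the way identifiers shift when a value is cleared, can be sketched in Python. The function name incremental_ids and the sample names are hypothetical.

```python
def incremental_ids(names):
    """Give each non-blank name the next number; blanks get an empty string."""
    ids = []
    highest = 0                  # plays the role of MAX(A$1:A1)
    for name in names:
        if name == "":
            ids.append("")
        else:
            highest += 1
            ids.append(highest)
    return ids

print(incremental_ids(["Ann", "Bob", "Cid"]))  # [1, 2, 3]
print(incremental_ids(["Ann", "", "Cid"]))     # [1, '', 2]
```

In the second call, clearing the middle name changes the ID of the last one from 3 to 2, which is the shifting described above.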
Unique identifier based on the number of specific characters in text. I used an identifier based on vowels and numbers.
=LEN($J$14)-LEN(SUBSTITUTE($J$14;"a";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"e";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"i";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"j";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"o";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"u";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"y";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"1";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"2";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"3";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"4";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"5";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"6";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"7";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"8";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"9";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"0";""))
You say you are confident that there are no duplicate values in your words. To push it further, are you confident that the first 8 characters in any word would be unique?
If so, you can use the below formula. It works by taking each character's ASCII code minus 40 [assuming normal characters, this puts digits between 8 and 17, and lowercase letters between 57 and 82], and multiplying that code by 10 ^ [that character's digit placement in the word]. Basically it takes each character code [-40] and concatenates it onto the next.
EDIT Note that this formula no longer requires at least 8 characters in your word to prevent an error, as the word to be coded has 8 "0"s prepended to it. Because of the RIGHT(..., 8), the characters actually encoded are the last 8 of the padded word.
=TEXT(SUM((CODE(MID(LOWER(RIGHT(REPT("0",8)&A3,8)),{1,2,3,4,5,6,7,8},1))-40)*10^{0,2,4,6,8,10,12,14}),"#")
Note that as this uses the ASCII values of the characters, the ID # could be used to identify the name directly - this does not really create anonymity, it just turns 8 unique characters into a unique number. It is obfuscated with the -40, but not really 'safe' in that sense. The -40 is just to get normal letters and numbers in the 2 digit range, so that multiplying by 10^0,2,4 etc. will create a 2 digit unique add-on to the created code.
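The arithmetic of that formula can be followed more easily in a few lines of Python. This is a sketch under the same assumptions as the formula (plain ASCII names); the function name positional_id is hypothetical.

```python
def positional_id(name):
    """Encode the last 8 characters of the zero-padded, lowercased name as 2-digit pairs."""
    padded = ("0" * 8 + name)[-8:].lower()   # RIGHT(REPT("0",8)&A3, 8), then LOWER
    # character i contributes (code - 40) * 10^(2*i), matching 10^{0,2,4,...,14}
    return sum((ord(ch) - 40) * 10 ** (2 * i) for i, ch in enumerate(padded))

print(positional_id("ab"))  # 5857080808080808
print(positional_id("ba"))  # 5758080808080808
```

Reading the result in pairs from the right gives 08 for each padding "0", then 57 for a and 58 for b. Python integers are exact, whereas Excel keeps about 15 significant digits, so a 16-digit result like this one may lose its last digit in a worksheet.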
EDIT FOR ALTERNATIVE METHOD
I had previously attempted to do this so that it would look at each letter of the alphabet, count the number of times it appears in the word, and then multiply that by 10^[that letter's position in the alphabet]. The problem with doing this (see comment below for formula) is that it required a number up to 10^26-1, which is beyond Excel's floating point precision. However, I have a modified version of that method:
By limiting the number of allowed characters in the alphabet, we can get the max total size possible to 10^15-1, which Excel can properly calculate. The formula looks like this:
=RIGHT(REPT("0",15)&TEXT(SUM(LEN(A3)*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}-LEN(SUBSTITUTE(A3,MID(Alphabet,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15},1),""))*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}),"#"),15)
[The RIGHT(REPT("0",15)&... portion of the formula is meant to keep all codes the same number of characters]
Note that here, Alphabet is a named string which holds the characters: "abcdehilmnorstu". For example, using the above formula, the word "asdf" counts the instances of a, s, and d, but not 'f' which isn't in my contracted alphabet. The code of "asdf" would be:
001000000001001
This only works with the following assumptions:
The letters not listed (nor numbers / special characters) are not required to make each name unique. For example, asdf & asd would have the same code in the above method.
And,
The order of the letters is not required to make each name unique. For example, asd & dsa would have the same code in the above method.
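The letter-count code and both of its limitations can be reproduced with a short Python sketch. The name count_code is hypothetical; ALPHABET holds the same 15 characters as the named string above.

```python
ALPHABET = "abcdehilmnorstu"

def count_code(name):
    """One digit per alphabet letter: how often it occurs, first letter rightmost."""
    # A letter occurring 10 or more times would carry into the next digit
    total = sum(name.count(ch) * 10 ** i for i, ch in enumerate(ALPHABET))
    return str(total).zfill(15)[-15:]   # fixed width of 15, like RIGHT(REPT("0",15)&..., 15)

print(count_code("asdf"))  # 001000000001001
print(count_code("asd"))   # same code: f is not in the alphabet
print(count_code("dsa"))   # same code: order is ignored
```

Like SUBSTITUTE in the formula, str.count is case-sensitive, so uppercase letters are not counted unless the name is lowercased first.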
