Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
Improve this question
We have been asked by a client to incorporate ICD-9 codes into a system.
I'm looking for a good resource to get a complete listing of codes and descriptions that will end up in a SQL database.
Unfortunately a web service is out of the question as a fair amount of the time folks will be off line using the application.
I've found http://icd9cm.chrisendres.com/ and http://www.icd9data.com/ but neither offer downloads/exports of the data that I could find.
I also found http://www.cms.hhs.gov/MinimumDataSets20/07_RAVENSoftware.asp which has a database of the ICD-9 codes but they are not in the correct format and I'm not 100% sure how to properly convert (It shows the code 5566 which is really 556.6 but I can't find a rule as to how/when to convert the code to include a decimal)
I'm tagging this with medical and data since I'm not 100% sure where it should really be tagged...any help there would also be appreciated.
Just wanted to chime in on how to correct the code decimal places. First, there are four broad points to consider:
Standard codes have Decimal place XXX.XX
Some Codes Do not have trailing decimal places
V Codes also follow the XXX.XX format --> V54.31
E Codes follow XXXX.X --> E850.9
Thus the general logic of how to fix the errors is
If first character = E:
If 5th character = '':
Ignore
Else replace XXXXX with XXXX.X
Else If 4th-5th Char is not '': (XXXX or XXXXX)
replace XXXXX with XXX + . + remainder (XXX.XX or XXX.X)
(All remaining are XXX)
I implemented this with two SQL Update statements:
Number 1, for Non E-codes:
USE MainDb;
UPDATE "dbo"."icd9cm_diagnosis_codes"
SET "DIAGNOSIS CODE" = SUBSTRING("DIAGNOSIS CODE",1,3)+'.'+SUBSTRING("DIAGNOSIS CODE",4,5)
FROM "dbo"."icd9cm_diagnosis_codes"
WHERE
SUBSTRING("DIAGNOSIS CODE",4,5) != ''
AND
LEFT("DIAGNOSIS CODE",1) != 'E'
Number 2 - For E Codes:
UPDATE "dbo"."icd9cm_diagnosis_codes"
SET "DIAGNOSIS CODE" = SUBSTRING("DIAGNOSIS CODE",1,4)+'.'+SUBSTRING("DIAGNOSIS CODE",5,5)
FROM "dbo"."icd9_Diagnosis_table"
WHERE
LEFT("DIAGNOSIS CODE",1) = 'E'
AND
SUBSTRING("DIAGNOSIS CODE",5,5) != ''
Seemed to do the trick for me (Using SQL Server 2008).
I ran into this same issue a while back and ended up building my own solution from scratch. Recently, I put up an open API for the codes for others to use: http://aqua.io/codes/icd9/documentation
You can just download all codes in JSON (http://api.aqua.io/codes/beta/icd9.json) or pull an individual code (http://api.aqua.io/codes/beta/icd9/250-1.json). Pulling a single code not only gives you the ICD-10 "crosswalk" (equivalents), but also some extra goodies, like relevant Wikipedia links.
I finally found the following:
"The field for the ICD-9-CM Principal and Other Diagnosis Codes is six characters in length, with the decimal point implied between the third and fourth digit for all diagnosis codes other than the V codes. The decimal is implied for V codes between the second and third digit."
So I was able to get a hold of a complete ICD-9 list and reformat as required.
You might find that the ICD-9 codes follow the following format:
All codes are 6 characters long
The decimal point comes between the 3rd and 4th characters
If the code starts with a V character the decimal point comes between the 2nd and 3rd characters
Check this out: http://en.wikipedia.org/wiki/List_of_ICD-9_codes
I struggled with this issue myself for a long time as well. The best resource I have been able to find for these are the zip files here:
https://www.cms.gov/ICD9ProviderDiagnosticCodes/06_codes.asp
It's unfortunate because they (oddly) are missing the decimal places, but as several other posters have pointed out, adding them is fairly easy since the rules are known. I was able to use a regular expression based "find and replace" in my text editor to add them. One thing to watch out for if you go that route is that you can end up with codes that have a trailing "." but no zero after it. That's not valid, so you might need to go through and do another find/replace to clean those up.
The annoying thing about the data files in the link above is that there is no relationship to categories. Which you might need depending on your application. I ended up taking one of the RTF-based category files I found online and re-formatting it to get the ranges of each category. That was still doable in a text editor with some creative regular expressions.
I was able to use the helpful answers here an create a groovy script to decimalize the code and combine long and short descriptions into a tab separated list. In case this helps anyone, I'm including my code here:
import org.apache.log4j.BasicConfigurator
import org.apache.log4j.Level
import org.apache.log4j.Logger
import java.util.regex.Matcher
import java.util.regex.Pattern
Logger log = Logger.getRootLogger()
BasicConfigurator.configure();
Logger.getRootLogger().setLevel(Level.INFO);
Map shortDescMap = [:]
new File('CMS31_DESC_SHORT_DX.txt').eachLine {String l ->
int split = l.indexOf(' ')
String code = l[0..split].trim()
String desc = l[split+1..-1].trim()
shortDescMap.put(code, desc)
}
int shortLenCheck = 40 // arbitrary lengths, but provide some sanity checking...
int longLenCheck = 300
File longDescFile = new File('CMS31_DESC_LONG_DX.txt')
Map cmsRows = [:]
Pattern p = Pattern.compile(/^(\w*)\s+(.*)$/)
new File('parsedICD9.csv').withWriter { out ->
out.write('ICD9 Code\tShort Description\tLong Description\n')
longDescFile.eachLine {String row ->
Matcher m = row =~ p
if (m.matches()) {
String code = m.group(1)
String shortDescription = shortDescMap.get(code)
String longDescription = m.group(2)
if(shortDescription.size() > shortLenCheck){
log.info("Not short? $shortDescription")
}
if(longDescription.size() > longLenCheck){
log.info("${longDescription.size()} == Too long? $longDescription")
}
log.debug("Match 1:${code} -- 2:${longDescription} -- orig:$row")
if (code.startsWith('V')) {
if (code.size() > 3) {
code = code[0..2] + '.' + code[3..-1]
}
log.info("Code: $code")
} else if (code.startsWith('E')) {
if (code.size() > 4) {
code = code[0..3] + '.' + code[4..-1]
}
log.info("Code: $code")
} else if (code.size() > 3) {
code = code[0..2] + '.' + code[3..-1]
}
if (code) {
cmsRows.put(code, ['longDesc': longDescription])
}
out.write("$code\t$shortDescription\t$longDescription\n")
} else {
log.warn "No match for row: $row"
}
}
}
I hope this helps someone.
Sean
Related
I have had to look up hundreds (if not thousands) of free-text answers on google, making notes in Excel along the way and inserting SAS-code around the answers as a last step.
The output looks like this:
This output contains an unnecessary number of blank spaces, which seems to confuse SAS's search to the point where the observations can't be properly located.
It works if I manually erase superflous spaces, but that will probably take hours. Is there an automated fix for this, either in SAS or in excel?
I tried using the STRIP-function, to no avail:
else if R_res_ort_txt=strip(" arild ") and R_kom_lan=strip(" skåne ") then R_kommun=strip(" Höganäs " );
If you want to generate a string like:
if R_res_ort_txt="arild" and R_kom_lan="skåne" then R_kommun="Höganäs";
from three variables, let's call them A B C, then just use code like:
string=catx(' ','if R_res_ort_txt=',quote(trim(A))
,'and R_kom_lan=',quote(trim(B))
,'then R_kommun=',quote(trim(C)),';') ;
Or if you are just writing that string to a file just use this PUT statement syntax.
put 'if R_res_ort_txt=' A :$quote. 'and R_kom_lan=' B :$quote.
'then R_kommun=' C :$quote. ';' ;
A saner solution would be to continue using the free-text answers as data and perform your matching criteria for transformations with a left join.
proc import out=answers datafile='my-free-text-answers.xlsx';
data have;
attrib R_res_ort_txt R_kom_lan length=$100;
input R_res_ort_txt ...;
datalines4;
... whatever all those transforms will be performed on...
;;;;
proc sql;
create table want as
select
have.* ,
answers.R_kommun_answer as R_kommun
from
have
left join
answers
on
have.R_res_ort_txt = answers.res_ort_answer
& have.R_kom_lan = abswers.kom_lan_answer
;
I solved this by adding quotes in excel using the flash fill function:
https://www.youtube.com/watch?v=nE65QeDoepc
I have this kind of df:
df = pd.DataFrame({"text_column" : ['question: everybody is kongfu fighting', 'panda: of course', 'question: Why is the world so great ?', 'friend: Everybody is smart', 'and everybody is cool', 'enemy: no that is just not true', 'jordan: i want to add one thing: please', 'do not talk about this.', ' 2nd question : are you sure ?', 'yeah sure' ]})
text_column
0 question: everybody is kongfu fighting
1 panda: of course
2 question: Why is the world so great ?
3 friend: Everybody is smart
4 and everybody is cool
5 enemy: no that is just not true
6 jordan: i want to add one thing: please
7 do not talk about this.
8 2nd question : are you sure ?
9 messi: yeah sure
10 question: you are sure about this ?
11 donald: youre questions are stupid!
I want the following output
type_column new_text_column
0 question: panda: everybody is kongfu fighting of course
1 question: friend: enemy: jordan: 2nd question : messi: Why is the world so great ? Everybody is smart and everybody is cool no that is just not true i want to add one thing: please do not talk about this. are you sure ? yeah sure
2 question: donald: youre questions are stupid!
Basically each question and answer (topic) have to be in one cell.
I could write a function that works but would use apply, which is in general not an optimal solution.
Does anybody have a good idea how to do it?
Define the following functions:
"Specialized" split of the source text field into 2 parts:
def mySplit(txt):
tbl = re.split(': ?', txt, 1)
if len(tbl) == 1:
tbl.insert(0, '')
return pd.Series(tbl, index=['Qn', 'Ans'])
Reformat a group of rows:
def reformat(grp):
t1 = ': '.join(grp.Qn.tolist()) + ':'
t2 = ' '.join(grp.Ans.tolist())
return pd.Series([t1, t2], index=['type_column', 'new_text_column'])
Then, to get the result run:
df.text_column.apply(mySplit)\
.groupby(df2.Qn.str.startswith('question').cumsum())\
.apply(reformat).reset_index(drop=True)
It performs:
Specialized split of text_column into 2 columns (Qn and Ans).
Cut into groups starting on each row with Qn starting with question.
Apply reformat to each group.
Reset the index (discarding the old index).
It's hard to tell from the example what the criteria is for separating.
I'm guessing it's splitting on the colon, so you could try list comprehension
df["type_column"] = [x.split(":")[0] for x in df["text_column"]]
df["new_text_column"] = [x.split(":")[1] for x in df["text_column"]]
I am fairly new to Puppet and Ruby. Most likely this question has been asked before but I am not able to find any relevant information.
In my puppet code I will have a string variable retrieved from the fact hostname.
$n="$facts['hostname'].ex-ample.com"
I am expecting to get the values like these
DEV-123456-02B.ex-ample.com,
SCC-123456-02A.ex-ample.com,
DEV-123456-03B.ex-ample.com,
SCC-999999-04A.ex-ample.com
I want to perform the following action. Change the string to lowercase and then replace the
-02, -03 or -04 to -01.
So my output would be like
dev-123456-01b.ex-ample.com,
scc-123456-01a.ex-ample.com,
dev-123456-01b.ex-ample.com,
scc-999999-01a.ex-ample.com
I figured I would need to use .downcase on $n to make everything lowercase. But I am not sure how to replace the digits. I was thinking of .gsub or split but not sure how. I would prefer to make this happen in a oneline code.
If you really want a one-liner, you could run this against each string:
str
.downcase
.split('-')
.map
.with_index { |substr, i| i == 2 ? substr.gsub(/0[0-9]/, '01') : substr }
.join('-')
Without knowing what format your input list is taking, I'm not sure how to advise on how to iterate through it, but maybe you have that covered already. Hope it helps.
Note that Puppet and Ruby are entirely different languages and the other answers are for Ruby and won't work in Puppet.
What you need is:
$h = downcase(regsubst($facts['hostname'], '..(.)$', '01\1'))
$n = "${h}.ex-ample.com"
notice($n)
Note:
The downcase and regsubst functions come from stdlib.
I do a regex search and replace using the regsubst function and replace ..(.)$ - 2 characters followed by another one that I capture at the end of the string and replace that with 01 and the captured string.
All of that is then downcased.
If the -01--04 part is always on the same string index you could use that to replace the content.
original = 'DEV-123456-02B.ex-ample.com'
# 11 -^
string = original.downcase # creates a new downcased string
string[11, 2] = '01' # replace from index 11, 2 characters
string #=> "dev-123456-01b.ex-ample.com"
My flutter apps displays description strings which are fetched from a third party, thus these descriptions may already be fitted with ellipsis as to lead the story onwards but the api only fetches around 7 lines or so of the description.
I want every description to contain an ellipsis but don't want to add ellipsis to strings that already contain them.
I want to turn this
scan the face of each account holder as
into this
scan the face of each account holder as...
but if the original already contains this ellipsis, it should be skipped.
String input = 'your string';
if (!input.endsWith('...')) {
input += '...';
}
Key here is String.endsWith(), it's the easiest way to know if the content already ends with an ellipsis.
Enhanced answered from #greyaurora
be careful trailing space, e.g. my data...
be careful with different ways to display ellipsis, e.g. ... (3 dots ) vs … (1-char ellipsis)
void main() {
String input = 'your string';
String trimmed = input.trim();
if (!trimmed.endsWith('...') && !trimmed.endsWith('…')) {
trimmed += '…';
}
print('hello ${trimmed}');
}
I have seen some questions (like this one) here asking about if a cell in Excel can be formatted by NPOI/POI as if formatted by Excel. As most of you, I have to deal with issues with Currency and DateTime. Here let me ask how the formatting can be achieved as if it has been formatted by Excel? (I will answer this question myself as to demonstrate how to do it.)
Setting: Windows 10, English, Region: Taiwan
Excel format: XLSX (version 2007 and later)
(Sorry about various edit of this question as I have pressed the 'Enter' button at unexpected time.)
If you format a cell as Currency, you have 4 choices:
The internal format of each style is as follow:
-NT$1,234.10
<numFmt formatCode=""NT$"#,##0.00" numFmtId="164"/>
[RED]NT$1,234.10
<numFmt formatCode=""NT$"#,##0.00;[Red]"NT$"#,##0.00" numFmtId="164"/>
-NT$1,234.10
<numFmt formatCode=""NT$"#,##0.00_);("NT$"#,##0.00)" numFmtId="7"/>
[RED]-NT$1,234.10
<numFmt formatCode=""NT$"#,##0.00_);[Red]("NT$"#,##0.00)" numFmtId="8"/>
Note: There is a pair of double quote (") comes before and after NT$.
(To get internal format of XLSX, just unzip it. The Style information is available in <unzip dir>\xl\Styles.xml Check out this answer if you need more information.)
(FYI: In formatCode, the '0' represent a digit. The '#' also represent a digit, but will not appear if the number is not large enough. So any number less than 1000 will not have the comma inside it. The '_' is a space holder. In format 3, '1.75' appears as 'NT$1.75 '. The last one is a space.)
(FYI: In numFmtId, for case 1 and case 2, number 164 is for user-defined. For case 3 and 4, number 7 and 8 are build-in style.)
For developers using POI/NPOI, you may find out if you format your currency column using Build In Format using 0x7 or 0x8, you can get only the third or fourth choice. You cannot get the first or second choice.
To get the first choice, you build upon style 0x7 "$#,##0.00);($#,##0.00)". You need to add the currency symbol and the pair of double quotes in front of it.
styleCurrency.DataFormat = workbook.CreateDataFormat().GetFormat("\"NT$\"#,##0.00");
Apply this format to a cell with number. Once you open the Excel result file, right click to check formatting, you will see the first choice.
Please feel free to comment on this post.
var cell5 = row.CreateCell(5, CellType.Numeric);
cell5.SetCellValue(item.OrderTotal);
var styleCurrency = workbook.CreateCellStyle();
styleCurrency.DataFormat= workbook.CreateDataFormat().GetFormat(string.Format("\"{0}\"#,##0.00", item.CurrencySymbol));//styleCurrency;
cell5.CellStyle = styleCurrency;
styleCurrency = null;
Iterate over loop for multiple currency.
Function to GetCurrencySymbol against currency Code on C#
private string GetCurencySymbol(string isOcurrencyCode)
{
return CultureInfo.GetCultures(CultureTypes.AllCultures).Where(c => !c.IsNeutralCulture)
.Select(culture =>
{
try
{
return new RegionInfo(culture.LCID);
}
catch
{
return null;
}
})
.Where(ri => ri != null && ri.ISOCurrencySymbol == isOcurrencyCode)
.Select(ri => ri.CurrencySymbol).FirstOrDefault();}