Haskell, HDBC.Sqlite3 - How to add a column if it doesn't exist already?

I have a function that given an Int returns a list of lists of Strings.
fetchParts :: Int -> [[String]]
This is what the output looks like:
[["title", "some title"], ["rate", "2.4"], ["dist", "some string"], ["tr", "1"], ["td", "2"], ...]
The length of the output varies. Only the first three lists are guaranteed to be present.
The rest of the list can be
["a", "1"], ["b", "2"], ...
or
["some", "1"], ["part", "2"], ["of", "3"], ...
or
["ex1", "a"], ["ex2", "b"], ...
or some other combination of strings.
I want to add this output to a sqlite3 database file, using HDBC and HDBC.Sqlite3.
To add something to the database file I run code like this:
initialConnection <- connectSqlite3 "src/parts.db"
run initialConnection partsEntry []
commit initialConnection
disconnect initialConnection
where partsEntry is a simple SQL String like this:
partsEntry = "INSERT INTO PARTSDATA (title, rate, dist, ...) VALUES ('some title', '2.4', 'some string', ...)"
where
( title, rate, dist, ...) are from head <$> fetchParts 1
and
("some title", "2.4", "some string" ...) are from last <$> fetchParts 1
The problem is that if some column doesn't exist, the code throws an error.
What I want to do is something like this:
if column "abc" doesn't exist, add column "abc" and insert
"this" value at the current row
if column "abc" exists, just insert "this" value at the current row
But I'm not sure how to go about doing that.

I was able to solve the problem.
First, use the describeTable function from the HDBC package. It returns each column's name together with a description of its type. If you only need the names, as I did, you can do this:
getColumnsInTable :: IConnection conn => conn -> String -> IO [String]
getColumnsInTable conn tableName = do
  d <- describeTable conn tableName
  return $ fst <$> d
The result contains the names of all the columns.
Scan through the list to see whether it contains every column you need. If it doesn't, use a function like the following to alter the table, i.e. add a new column of type INT.
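-- columnName is spliced directly into the SQL, so it must come from a trusted source.
createNewColumn :: IConnection conn => conn -> String -> IO Integer
createNewColumn conn columnName = do
  let stmt = "ALTER TABLE FantasyBooks ADD COLUMN " ++ columnName ++ " INT;"
  run conn stmt []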
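The "scan through the list" step can be kept pure and separate from the database calls. Below is a minimal sketch, assuming the existing column names have already been fetched (for example with getColumnsInTable) and that new columns should hold text. The helpers missingColumns, addColumnStmt, quoteIdent and insertStmt are hypothetical names, not part of HDBC.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Columns we want that the table does not have yet.
-- SQLite compares column names case-insensitively, so compare in lower case.
missingColumns :: [String] -> [String] -> [String]
missingColumns existing wanted =
  [c | c <- wanted, map toLower c `notElem` map (map toLower) existing]

-- Double-quote an identifier, doubling any embedded double quotes.
-- Identifiers cannot be bound as "?" parameters, so they are quoted instead.
quoteIdent :: String -> String
quoteIdent s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"' = "\"\""
    esc c   = [c]

-- ALTER TABLE statement for one missing column (TEXT is an assumption).
addColumnStmt :: String -> String -> String
addColumnStmt table col =
  "ALTER TABLE " ++ quoteIdent table ++ " ADD COLUMN " ++ quoteIdent col ++ " TEXT"

-- Parameterised INSERT: one "?" placeholder per column, values bound separately.
insertStmt :: String -> [String] -> String
insertStmt table cols =
  "INSERT INTO " ++ quoteIdent table
    ++ " (" ++ intercalate ", " (map quoteIdent cols) ++ ")"
    ++ " VALUES (" ++ intercalate ", " (map (const "?") cols) ++ ")"
```

With HDBC, each string from addColumnStmt would then be passed to run conn stmt [], and the insert would be run as run conn (insertStmt table cols) (map toSql values), followed by commit conn. Binding the values as parameters avoids having to quote them by hand.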

Related

Variable Not in Scope from Tuple

I have a function that looks like the following:
format input =
let
input (name,_,_,_) = name
n = 15 - length name
The purpose of this is to get a value from a tuple and store that value (the name) into a variable called name. Then I make a new variable called n which subtracts the length of that string from a number. When I compile this, I get an error saying that "name" is out of scope on the n = ... line.
Variable not in scope: name :: t1 a2
Not exactly sure where to go from here or what I might need to change.
The variable input goes on the right-hand side; the pattern (name, _, _, _) by itself goes on the left.
format input =
  let (name, _, _, _) = input
      n = 15 - length name
  in ...
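For illustration, a complete version might look like the sketch below. The body after "in" is not shown in the question, so padding the name with spaces to a width of 15 is only a guess at the intent. The second definition, format', is a hypothetical variant that matches the tuple directly in the argument position, which avoids the let binding altogether.

```haskell
-- Hypothetical completion: pad the name with spaces to a width of 15.
format :: (String, a, b, c) -> String
format input =
  let (name, _, _, _) = input
      n = 15 - length name
  in name ++ replicate n ' '

-- Equivalent, with the tuple pattern in the function argument.
format' :: (String, a, b, c) -> String
format' (name, _, _, _) = name ++ replicate (15 - length name) ' '
```

Names longer than 15 characters are returned unchanged, because replicate yields an empty list for a negative count.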

Matching words/phrases in Scala on an as-is basis

I have to find if a given phrase/word exists in a paragraph or not. Here's what I have done, given "wordlist" is the paragraph in which I have to look for phrases/words and "words" is the phrase/word.
if (wordlist contains words){println(words)}
But this also does a substring search:
"value of this" contains "val" is true. I want true only in those cases where the phrase/word is present as is and is not part of another string in "wordlist". So, "value of this" contains "x" should give true for the following values of x:
"value", "value of", "this" etc., and false for "val", "alue", "e of", "his" etc. Any help would be appreciated.
I believe that to make it faster you need to build an index. The initial cost of building the index is high, but matching afterwards is much faster. Otherwise you have to traverse all the possibilities, which is slow.
I'll use "value of this" as an example. One idea is to build a (sorted) Map keyed on the character counts of every combination of phrases.
value would be Map(Map(a -> 1, e -> 1, l -> 1, u -> 1, v -> 1) -> List(value)).
value of would be Map(Map( -> 1, a -> 1, e -> 1, f -> 1, l -> 1, o -> 1, u -> 1, v -> 1) -> List(value of))
and so on.
Then, to check whether a phrase/word exists, you can match on the character frequencies. You'll get a List of candidates, which you have to check again, since different phrases can share the same character counts.
This is a bit like trying to find a sublist in a list, so one approach would be to convert both into lists of words, as follows:
wordlist.split(" ") containsSlice words.split(" ")
From the REPL, it looks like this meets your requirements (if not please expand!):
scala> def hasPhrase(wordList:String,words:String) = wordList.split(" ") containsSlice words.split(" ")
hasPhrase: (wordList: String, words: String)Boolean
scala> hasPhrase("value of this","value")
res13: Boolean = true
scala> hasPhrase("value of this","value of")
res14: Boolean = true
scala> hasPhrase("value of this","val")
res15: Boolean = false
scala> hasPhrase("value of this","his")
res16: Boolean = false
Splitting both strings is not going to be efficient across big strings or a large number of strings. If your use case allows, you could split the long phrase just once (so that you can do wordlistAsCollection containsSlice words.split(" ")). You could also try a regex approach as suggested in the comments, perhaps along the lines of:
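def hasPhrase(wordList: String, words: String) =
  // Regex.quote escapes any regex metacharacters in the phrase
  new scala.util.matching.Regex("\\b" + scala.util.matching.Regex.quote(words) + "\\b")
    .findFirstMatchIn(wordList)
    .isDefined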
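The word-list approach translates directly to other languages. As a rough Haskell counterpart to containsSlice (Haskell being the language of the other examples on this page), isInfixOf from Data.List performs the same contiguous-sublist test. This is a sketch, not a translation of the Scala code above: words splits on any run of whitespace rather than on single spaces.

```haskell
import Data.List (isInfixOf)

-- True when the phrase occurs in the text as a contiguous run of whole words.
hasPhrase :: String -> String -> Bool
hasPhrase text phrase = words phrase `isInfixOf` words text
```

For example, hasPhrase "value of this" "value of" is True, while hasPhrase "value of this" "val" is False. An empty phrase is treated as present, because the empty list is an infix of every list.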

Search for string on each list item

I need to search a list for the item that contains this string: sp1_inicio.pbf.
The list is in this format:
["X,sp1_inicio.pbf,2,AB5E","X,sp1_chile.pbf,3,4F46"]
The application name is the second element sp1_inicio.pbf and the version is the third element, 2.
The application name is unique, no matter how big the list is, it won't repeat.
So based on the first string I need to search inside this list for the correct application and get its version number.
This data is returned from Riak. In the code below I'm showing only the function that I've created to deal with that situation.
Here is my code where I get the list and the file name (don't expect much, it's my first code):
get_application_version(LogicalNumber, Acronym, Object) ->
{_, Where, Name, _, _, _, _} = Object,
{ok, Client} = riak:local_client(),
% Try to get the logical number from the bucket terminals
case Client:get(<<"terminals">>, LogicalNumber) of
% If the logical number is returned, its value goes to the Terminal Variable
{ok, Terminal} ->
% Returns the Terminal value, in this case it is a json: group: xxx and decode it
{struct, TerminalValues} = mochijson2:decode(riak_object:get_value(Terminal)),
% Use proplist to get the value of the decoded json key 'groups'
% Here we already have the group ID of the current logical number in the variable GroupID
GroupId = proplists:get_value(<<"group">>, TerminalValues),
% Acronym with _
Acronym_ = string:concat(binary_to_list(Acronym), "_"),
% Group with acronym ex.: ab1_123
GroupName = string:concat(Acronym_, binary_to_list(GroupId)),
case Client:get(<<"groups">>, list_to_binary(GroupName)) of
{ok, Group} ->
{struct, GroupValues} = mochijson2:decode(riak_object:get_value(Group)),
AppsList = proplists:get_value(<<"apps_list">>, GroupValues);
%%% Right here I have all the data required to make the list search
%%% The list is inside AppsList
%%% The application name is inside Name
{error, notfound} ->
io:format("Group notfound")
end;
{error, notfound} ->
io:format("Terminal notfound")
end.
I don't know if building a list of strings is the best way of doing this, or even if this is the fastest approach, and that worries me.
You can use for example code like this:
find_app(Name, AppsList) ->
F = fun(X) ->
case string:tokens(X, ",") of
[_, Name, Version|_] -> {ok, Version};
_ -> next
end
end,
find_first(F, AppsList).
bin_find_app(Name, AppsList) ->
F = fun(X) ->
case binary:split(X, <<$,>>, [global]) of
[_, Name, Version|_] -> {ok, Version};
_ -> next
end
end,
find_first(F, AppsList).
find_first(_, []) -> not_found;
find_first(F, [X|L]) ->
case F(X) of
next -> find_first(F, L);
Result -> Result
end.
Example of usage:
1> c(search_for).
{ok,search_for}
2> L = ["X,sp1_inicio.pbf,2,AB5E","X,sp1_chile.pbf,3,4F46"].
["X,sp1_inicio.pbf,2,AB5E","X,sp1_chile.pbf,3,4F46"]
3> Name = "sp1_inicio.pbf".
"sp1_inicio.pbf"
4> search_for:find_app(Name, L).
{ok,"2"}
5> search_for:bin_find_app(list_to_binary(Name), [list_to_binary(X) || X <- L]).
{ok,<<"2">>}
Edit: You can work with binary as well.

HDBC and multiple resultsets in a single statement: only first resultset returned

I'm looking for a way to get HDBC to support multiple result sets in a single statement.
testMultipleResultsetSingleStatement = do
  let sql = "select 1,2,3 union all select 2,3,4 select 'a', 'b'"
  c <- connectODBC connectionString
  rs <- quickQuery c sql []
  return rs
This only returns [[SqlInt32 1,SqlInt32 2,SqlInt32 3],[SqlInt32 2,SqlInt32 3,SqlInt32 4]].
The results from the second result set are discarded.
Is there a function other than quickQuery that would support this?
Ideally, the return type would be [[[SqlValue]]] instead of [[SqlValue]], so that each element of the outermost list corresponds to one result set returned by the query.
If HDBC doesn't provide a way to do this, what other package handles statements which return multiple result sets?
Edit: actually, a solution without an API change would be to make it work this way:
testMultipleResultsetSingleStatement = do
  let
    sql = "select 1,2,3 union all select 2,3,4 select 'a', 'b'"
  c <- connectODBC connectionString
  statement <- prepare c sql
  _ <- execute statement []
  rows1 <- fetchAllRows statement
  rows2 <- fetchAllRows statement
  return (rows1, rows2)
I checked, and in the case of SQL Server it did return an empty list for rows2.
No, this is not currently supported in HDBC.
Though I see no purpose for this kind of feature.
How would it be better than
let sql1 = "select 1,2,3 union all select 2,3,4"
    sql2 = " select 'a', 'b'"
c <- connectODBC connectionString
rs1 <- quickQuery c sql1 []
rs2 <- quickQuery c sql2 []
return (rs1,rs2)
or, if you really insist on having different records with different numbers and types of fields in one list (hmm, weird, but ok), you could do this:
return $ rs1 ++ rs2

Insert a character at a specific location in a string

I would like to insert an extra character (or a new string) at a specific location in a string. For example, I want to insert d at the fourth location in abcefg to get abcdefg.
Now I am using:
old <- "abcefg"
n <- 4
paste(substr(old, 1, n-1), "d", substr(old, n, nchar(old)), sep = "")
I could write a one-line simple function for this task, but I am just curious if there is an existing function for that.
You can do this with regular expressions and gsub.
gsub('^([a-z]{3})([a-z]+)$', '\\1d\\2', old)
# [1] "abcdefg"
If you want to do this dynamically, you can create the expressions using paste:
letter <- 'd'
lhs <- paste0('^([a-z]{', n-1, '})([a-z]+)$')
rhs <- paste0('\\1', letter, '\\2')
gsub(lhs, rhs, old)
# [1] "abcdefg"
As per DWin's comment, you may want this to be more general.
gsub('^(.{3})(.*)$', '\\1d\\2', old)
This way any three characters will match rather than only lower case. DWin also suggests using sub instead of gsub. This way you don't have to worry about the ^ as much since sub will only match the first instance. But I like to be explicit in regular expressions and only move to more general ones as I understand them and find a need for more generality.
As Greg Snow noted, you can use another form of regular expression, a lookbehind, which matches the position after the first three characters:
sub( '(?<=.{3})', 'd', old, perl=TRUE )
You could also build the dynamic gsub pattern above using sprintf rather than paste0:
lhs <- sprintf('^([a-z]{%d})([a-z]+)$', n-1)
or, for his sub regular expression:
lhs <- sprintf('(?<=.{%d})',n-1)
The stringi package to the rescue once again! It offers the simplest and most elegant solution among those presented.
The stri_sub function allows you to extract parts of a string and to substitute parts of it, like this:
x <- "abcde"
stri_sub(x, 1, 3) # from first to third character
# [1] "abc"
stri_sub(x, 1, 3) <- 1 # substitute from first to third character
x
# [1] "1de"
But if you do this:
x <- "abcde"
stri_sub(x, 3, 2) # from 3 to 2 so... zero ?
# [1] ""
stri_sub(x, 3, 2) <- 1 # substitute from 3 to 2 ... hmm
x
# [1] "ab1cde"
then no characters are removed but new ones are inserted. Isn't that cool? :)
@Justin's answer is the way I'd actually approach this because of its flexibility, but this could also be a fun approach.
You can treat the string as "fixed width format" and specify where you want to insert your character:
paste(read.fwf(textConnection(old),
c(4, nchar(old)), as.is = TRUE),
collapse = "d")
Particularly nice is the output when using sapply, since you get to see the original string as the "name".
newold <- c("some", "random", "words", "strung", "together")
sapply(newold, function(x) paste(read.fwf(textConnection(x),
c(4, nchar(x)), as.is = TRUE),
collapse = "-WEE-"))
#            some          random           words          strung        together
#   "some-WEE-NA"   "rand-WEE-om"    "word-WEE-s"   "stru-WEE-ng" "toge-WEE-ther"
Your original way of doing this (i.e. splitting the string at an index and pasting in the inserted text) could be made into a generic function like so:
split_str_by_index <- function(target, index) {
index <- sort(index)
substr(rep(target, length(index) + 1),
start = c(1, index),
stop = c(index -1, nchar(target)))
}
#Taken from https://stat.ethz.ch/pipermail/r-help/2006-March/101023.html
interleave <- function(v1,v2)
{
ord1 <- 2*(1:length(v1))-1
ord2 <- 2*(1:length(v2))
c(v1,v2)[order(c(ord1,ord2))]
}
insert_str <- function(target, insert, index) {
insert <- insert[order(index)]
index <- sort(index)
paste(interleave(split_str_by_index(target, index), insert), collapse="")
}
Example usage:
> insert_str("1234567890", c("a", "b", "c"), c(5, 9, 3))
[1] "12c34a5678b90"
This allows you to insert a vector of characters at the locations given by a vector of indexes. The split_str_by_index and interleave functions are also useful on their own.
Edit:
I revised the code to allow for indexes in any order. Before, indexes needed to be in ascending order.
I've made a custom function called substr1 to deal with extracting, replacing and inserting characters in a string. Run this code at the start of every session. Feel free to try it out and let me know if it needs to be improved.
# extraction
substr1 <- function(x,y) {
z <- sapply(strsplit(as.character(x),''),function(w) paste(na.omit(w[y]),collapse=''))
dim(z) <- dim(x)
return(z) }
# substitution + insertion
`substr1<-` <- function(x,y,value) {
names(y) <- c(value,rep('',length(y)-length(value)))
z <- sapply(strsplit(as.character(x),''),function(w) {
v <- seq(w)
names(v) <- w
paste(names(sort(c(y,v[setdiff(v,y)]))),collapse='') })
dim(z) <- dim(x)
return(z) }
# demonstration
abc <- 'abc'
substr1(abc,1)
# "a"
substr1(abc,c(1,3))
# "ac"
substr1(abc,-1)
# "bc"
substr1(abc,1) <- 'A'
# "Abc"
substr1(abc,1.5) <- 'A'
# "aAbc"
substr1(abc,c(0.5,2,3)) <- c('A','B')
# "AaB"
It took me some time to understand the regular expression, but afterwards I found my way with the numbers I had.
The end result was
old <- "89580000"
gsub('^([0-9]{5})([0-9]+)$', '\\1-\\2', old)
similar to yours!
This needs only base R: combine paste0 with substr.
Here is the exact code:
paste0(substr(old, 1,3), "d", substr(old,4,6))
In base you can use regmatches to insert a character at a specific location in a string.
old <- "abcefg"
n <- 4
regmatches(old, `attr<-`(n, "match.length", 0)) <- "d"
old
#[1] "abcdefg"
This could also be used with a regex to find the location to insert.
s <- "abcefg"
regmatches(s, regexpr("(?<=c)", s, perl=TRUE)) <- "d"
s
#[1] "abcdefg"
This also works for multiple matches, with individual replacements at the different matches.
s <- "abcefg abcefg"
regmatches(s, gregexpr("(?<=c)", s, perl=TRUE)) <- list(1:2)
s
#[1] "abc1efg abc2efg"
