I try to generate random the language for person documents for testing in LotusScript:
Dim arr_language(0 To 10) As String
arr_language(0) = "English"
arr_language(1) = "Spanish"
arr_language(2) = "Chinese"
arr_language(3) = "German"
arr_language(4) = "Dutch"
arr_language(5) = "Swedish"
arr_language(6) = "French"
arr_language(7) = "Danish"
arr_language(8) = "Italian"
arr_language(9) = "Polish"
arr_language(10) = "Portugese"
Dim language As String
language = arr_language( Round(Rnd()* UBound(arr_language) ,0) )
I notice sometimes 'language' is sometimes empty. What am I doing wrong?
First of all: Your code is ok although it does not produce all languages with the same probability: For 100 Users the languages English and Portuguese will appear about 4-5 times whereas the other languages will appear 10-11 times.
If you change your code to
language = arr_language( Int(Rnd() * (UBound(arr_language) + 1)) )
Then all values will have about the same probability.
BUT: Nonetheless your code will always produce a number between 0 and 10 and therefor the variable language will be set.
To really answer your question you need to provide that part of the code that produces the "empty" output for language. I guess -same as Rob does- that you take that value and put it in a field an a document and there it is probably removed by a ComputeWithForm or any other means of recalulating.
e.g. Your "Language" field might be a DialogList that does not allow new values and one or more of your array- values is not a valid entry for that field and therefor is removed (on Compute or on open)...
If I could quess I'd say it is the value "Portugese" as in english the right spelling is "Portuguese"...
Related
I am currently new with NLP and need guidance as of how I can solve this problem.
I am currently doing a filtering technique where I need to brand data in a database as either being correct or incorrect. I am given a structured data set, with columns and rows.
However, the filtering conditions are given to me in a text file.
An example filtering text file could be the following:
Values in the column ID which are bigger than 99
Values in the column Cash which are smaller than 10000
Values in the column EndDate that are smaller than values in StartDate
Values in the column Name that contain numeric characters
Any value that follows those conditions should be branded as bad.
However, I want to extract those conditions and append them to the program that I've made so far.
For instance, for the conditions above, I would like to produce
`if ID>99`
`if Cash<10000`
`if EndDate < StartDate`
`if Name LIKE %[1-9]%`
How can I achieve the above result using the Stanford NLP? (or any other NLP library).
This doesn't look like a machine learning problem; it's a simple parser. You have a simple syntax, from which you can easily extract the salient features:
column name
relationship
target value or target column
The resulting "action rule" is simply removing the "syntactic sugar" words and converting the relationship -- and possibly the target value -- to its symbolic form.
Enumerate all of your critical words for each position in a lexicon. Then use basic string manipulation operators in your chosen implementation language to find the three needed fields.
EXAMPLE
Given the data above, your lexicons might be like this:
column_trigger = "Values in the column"
relation_dict = {
"are bigger than" : ">",
"are smaller than" : "<",
"contain" : "LIKE",
...
}
value_desc = {
"numeric characters" : "%[1-9]%",
...
}
From here, use these items in standard parsing. If you're not familiar with that, please look up the basics of a simple sentence grammar in your favourite programming language, with rules such as such as
SENTENCE => SUBJ VERB OBJ
Does that get you going?
I am attempting to write an algorithm that selects a specific reference standard (vector) as a function of temperature. The temperature values are stored in a structure ( procspectra(i).temperature ). My reference standards are stored in another structure ( standards.interp.zeroed.ClOxxx ) where xxx are numbers such as 200, 210, 220, etc. I have built the rounding construct and paste it below.
for i = 1:length(procspectra);
if mod(-procspectra(i).temperature,10) > mod(procspectra(i).temperature,10);
%if mod(-) > mod(+) round down, else round up
tempvector(i) = procspectra(i).temperature - mod(procspectra(i).temperature,10);
else
tempvector(i) = procspectra(i).temperature + mod(-procspectra(i).temperature,10);
end
clostd = strcat('standards.interp.zeroed.ClO',num2str(tempvector(i)));
end
This construct works well. Now, I have built a string which is identical to the name of the vector I want to invoke, but I'm uncertain how to actually call the vector given that this is encoded as a string. Ideally I want to do something within the for-loop like:
parameters(i).standards.ClOstandard = clostd
where I actually am assigning that parameter structure to be the same as the vector I have saved in the standards structure I have previously generated (and not just a string)
Could anyone help out?
Don't construct clostd like that (containing the full variable name), make it contain only the last field name instead:
clostd = ['ClO' num2str(tempvector(i))];
parameters(i).standards.ClOstandard = standards.interp.zeroed.(clostd);
This is the syntax of accessing a structure's field dynamically, using a string. So the following three are equivalent:
struc.Cl0123
struc.('Cl0123')
fieldn='Cl0123'; struc.(fieldn)
I am currently attempting to parse data that is sent from an outside source serially. An example is as such:
DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_
This data can come in many different lengths, but the first few pieces are all the same. Each "piece" originally comes in with CRLF after, so I've replaced them with string.gsub(input,"\r\n","|") so that is why my input looks the way it does.
The part I would like to parse is:
4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_
The "4" tells me that there will be four lines total to create this file. I'm using this as a means to set the amount of passes in the loop.
The 7x5 is the font height.
The 1 is the xpos.
The 25 is the ypos.
The variable data (172-24 in this case) is the text at these parameters.
As you can see, it should continue to loop this pattern throughout the input string received. Now the "4" can actually be any variable > 0; with each number equaling a set of four variables to capture.
Here is what I have so far. Please excuse the loop variable, start variable, and print commands. I'm using Linux to run this function to try to troubleshoot.
function loop_input(input)
var = tonumber(string.match(val, "DATA|0|(%d*).*"))
loop = string.match(val, "DATA|0|")
start = string.match(val, loop.."(%d*)|.*")
for obj = 1, var do
for i = 1, 4 do
if i == 1 then
i = "font" -- want the first group to be set to font
elseif i == 2 then
i = "xpos" -- want the second group to be set to xpos
elseif i == 3 then
i = "ypos" -- want the third group to be set to ypos
else
i = "txt" -- want the fourth group to be set to text
end
obj = font..xpos..ypos..txt
--print (i)
end
objects = objects..obj -- concatenate newly created obj variables with each pass
end
end
val = "DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_"
print(loop_input(val))
Ideally, I want to create a loop that, depending on the var variable, will plug in the captured variables between the pipe deliminators and then I can use them freely as I wish. When trying to troubleshoot with parenthesis around my four variables (like I have above), I receive the full list of four variables four times in a row. Now I'm having difficulty actually cycling through the input string and actually grabbing them out as the loop moves down the data string. I was thinking that using the pipes as a means to delineate variables from one another would help. Am I wrong? If it doesn't matter and I can keep the [/r/n]+ instead of each "|" then I am definitely all for that.
I've searched around and found some threads that I thought would help but I'm not sure if tables or splitting the inputs would be advisable. Like these threads:
Setting a variable in a for loop (with temporary variable) Lua
How do I make a dynamic variable name in Lua?
Most efficient way to parse a file in Lua
I'm fairly new to programming and trying to teach myself. So please excuse my beginner thread. I have both the "Lua Reference Manual" and "Programming in Lua" books in paperback which is how I've tried to mock my function(s) off of. But I'm having a problem making the connection.
I thank you all for any input or guidance you can offer!
Cheers.
Try this:
val = "DATA|0|4|7x5|1|25|174-24|7x5|1|17|TERW|7x5|1|9|08MN|7x5|1|1|_"
val = val .. "|"
data = val:match("DATA|0|%d+|(.*)$")
for fh,xpos,ypos,text in data:gmatch("(.-)|(.-)|(.-)|(.-)|") do
print(fh,xpos,ypos,text)
end
If have been cracking my head over this for a week now:
We have an assignment, where we have 2 options in our program, with option 1, the program asks for a name and a date, and then it generates an email addressed to the give name, with that date.
The second option, we have to paste text in to program, and it will tell us if the 'template' from option 1 is used or not, and it gives you the name, and date.
my question is now: how do I compare the given string, with the manual input string and make that name, and date (could be 2nd of oktober, could be 10/02, could be sunday the 2nd, basically anything that isn't the same as the template) and still make it say the template matches?
I thought: cutting the strings up, comparing them, word for word, but then what? and how?
Since I do not know what language you are programming in, I will give you some examples of what you ask in the languages I know.
Python(2.7):
x = raw_input('Manual String') // get user input, this can be replaced
y = 'this is a string: '+ str(x) // use Str incase of a number or other format of x.
if(y == 'this is a string: doubleo'):
print "The strings are equal!"
C:
use this page:
http://www.tutorialspoint.com/ansi_c/c_strcmp.htm
As I asked in my previous question(Link) about concatenating a multipart string of variable lengths, I used the method answered there by rkhayrov and now, my function looks like this:
local sToReturn = string.format( "\t%03s\t%-25s\t%-7s\n\t", "S. No.", "UserName", "Score" )
SQLQuery = assert( Conn:execute( string.format( [[SELECT username, totalcount FROM chatstat ORDER BY totalcount DESC LIMIT %d]], iLimit ) ) )
DataArray = SQLQuery:fetch ({}, "a")
i = 1
while DataArray do
sTemp = string.format( "%03s\t%025s\t%-7d", tostring(i), DataArray.username, DataArray.totalcount )
sToReturn = sToReturn..sTemp.."\n\t"
DataArray = SQLQuery:fetch ({}, "a")
i = i + 1
end
But, even now, the value of score is still not following the order as required. The max length of username is 25. I've used %025s inside the while loop because I want the usernames to be right-justified, while the %-25s is to make the word UserName centre justified.
EDIT
Current output:
Required Output:
Displaying the list of top 5 chit-chatters.
S. No. UserName Score
1 Keeda 9440
2 _2.2_™ 7675
3 aim 7057
4 KGBRULES 6770
5 Guddu 6322
I think it's because of difference in fonts, but since most of the clients have Windows 7 default fonts(Tahoma/Verdana at 11px), I need optimum result for at-least that.
I think it's because of difference in fonts
It is. string.format formats by inserting whitespace. That only works for a fixed width fonts (i.e. all characters have the same width, including whitespace).
since most of the clients have Windows 7 default fonts(Tahoma/Verdana at 11px)
In what? How are they viewing your output? Do you write it to a textfile, that they then open in the editor of their choice (likely Notepad)? Then this approach will simply not work.
Don't know enough about your output requirements to steer you any futher, but it's worth noting that everyone has a browser so HTML output is very portable.
string.format doesn't truncate - the width of the field is minimum, not maximum. You'll have to truncate the strings to 25 characters yourself with something like DataArray.username:sub(0,25).
I'd remove the tabs from the string.format; and use the justification provided by %25s only. Won't be perfect but will probably be closer.
Use a fixed-width font if you can.