I am a beta tester for a hockey game, and the CSV file has the home and away teams reversed. The day, month, and year are correct though.
I need to change this:
20;1;1995;Toronto Maple Leafs;Los Angeles Kings
20;1;1995;Buffalo Sabres;New York Rangers
20;1;1995;St. Louis Blues;San Jose Sharks
20;1;1995;Pittsburgh Penguins;Tampa Bay Lightning
20;1;1995;Dallas Stars;Vancouver Canucks
20;1;1995;Calgary Flames;Winnipeg Jets
To this:
20;1;1995;Los Angeles Kings;Toronto Maple Leafs
20;1;1995;New York Rangers;Buffalo Sabres
20;1;1995;San Jose Sharks;St. Louis Blues
20;1;1995;Tampa Bay Lightning;Pittsburgh Penguins
20;1;1995;Vancouver Canucks;Dallas Stars
20;1;1995;Winnipeg Jets;Calgary Flames
Of course this is just a small sample...
Any help would be greatly appreciated!
Thank you!
Try Replace All (Ctrl+H) with regular expressions enabled.
Use (\d+;\d+;\d+;)([\s\S]+);([\s\S]+) as the "Find what" value and $1$3;$2 as the "Replace with" value.
I can't test it right now as I don't have Notepad++ installed on this computer. I tested it in Sublime and it worked.
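If a script is an option, the same capture-and-swap idea can be sketched in Python. This is only a sketch under assumptions: the function name swap_teams is made up for illustration, every line is assumed to have exactly the five semicolon-separated fields shown in the question, and the pattern uses [^;\r\n]+ rather than [\s\S]+ so that a match can never run past the end of a line.

```python
import re

# Three numeric fields (day;month;year;) followed by two team names.
# [^;\r\n]+ keeps each team name inside a single field on a single line.
LINE = re.compile(r"^(\d+;\d+;\d+;)([^;\r\n]+);([^;\r\n]+)", re.MULTILINE)

def swap_teams(text):
    """Return text with the last two fields of every matching line swapped."""
    return LINE.sub(r"\1\3;\2", text)

sample = (
    "20;1;1995;Toronto Maple Leafs;Los Angeles Kings\n"
    "20;1;1995;Buffalo Sabres;New York Rangers\n"
)
print(swap_teams(sample))
```

Lines that do not match the pattern are left untouched, so a header row or blank line would pass through unchanged.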
I would just do it manually: copy it into regular Notepad, run find and replace, then paste it all back again.
Find: ____________
Replace: ________________
Find: toronto
Replace: otnorot
(with Replace All)
etc.
Unless it is for code to do it automatically, in which case you're going to need a lot more integration for the compilation.
I had a similar question before!
I am trying to use Open AI API to create quiz questions with three incorrect answers and one correct answer. The prompt I use is
`Write a quiz on ${quiz_topic} with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer. Write each answer on a separate line`
I would like to parse ChatGPT's answer, but the format is inconsistent. For instance, if quiz_topic is "literature", I get the following response:
Q1. Who wrote the novel "The Catcher in the Rye"?
A. J.D. Salinger
B. Ernest Hemingway
C. F. Scott Fitzgerald
D. Mark Twain
This is the correct format. But if I use "U.S History", I get the following output:
Q1: Who was the first president of the United States?
A1: George Washington Jr., Thomas Jefferson, John Adams, George Washington
This is a different format. So my goal is to fine-tune it using the article here
I have created a training set of 200 lines. Here is a sample:
{"prompt":"Write a quiz on general knowledge with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the Italian word for \"tomato\"?\nA. Aglio\nA. Cipolla\nA. Peperoncino\nA. Pomodoro\n\nQ. Which restaurant's mascot is a clown?\nA. Whataburger\nA. Burger King\nA. Sonic\nA. McDonald's\n\nQ. Which of these colours is NOT featured in the logo for Google?\nA. Yellow\nA. Blue\nA. Green\nA. Pink\n\nQ. In 2013 how much money was lost by Nigerian scams?\nA. $95 Million\nA. $956 Million\nA. $2.7 Billion\nA. $12.7 Billion\n\nQ. What is the name of Poland in Polish?\nA. Pupcia\nA. Polszka\nA. Póland\nA. Polska\n\n \n\n###\n\n"}
{"prompt":"Write a quiz on books with 5 questions. Each question has 3 incorrect answers and 1 correct answer. The correct answer is always the last answer.Write each answer on a separate line \n\n---\n\n","completion":" Q. What is the name of Eragon's dragon in \"Eragon\"?\nA. Glaedr\nA. Thorn\nA. Arya\nA. Saphira\n\nQ. In the \"The Hobbit\", who kills Smaug?\nA. Bilbo Baggins\nA. Gandalf the Grey\nA. Frodo\nA. Bard\n\nQ. What is Hermione Granger's middle name?\nA. Jane\nA. Emma\nA. Jo\nA. Jean\n\nQ. According to The Hitchhiker's Guide to the Galaxy book, the answer to life, the universe and everything else is...\nA. Loving everyone around you\nA. Chocolate\nA. Death\nA. 42\n\nQ. What is the name of the three headed dog in Harry Potter and the Sorcerer's Stone?\nA. Spike\nA. Poofy\nA. Spot\nA. Fluffy\n\n \n\n###\n\n"}
When I run the validation tool with the command
openai tools fine_tunes.prepare_data -f training.jsonl
I get the following message
- All prompts start with prefix `Write a quiz on `. Fine-tuning doesn't require the instruction specifying the task, or a few-shot example scenario. Most of the time you should only add the input data into the prompt, and the desired output into the completion
I don't understand why I must remove "Write a quiz on". I must have misunderstood how to fine-tune a model for consistent formatting.
Can anybody shed light on how to make sure I get the same formatting with the same prompt?
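For reference, the parsing side could look roughly like the sketch below once the output does follow the "Q." / "A." layout used in the training samples. The function name parse_quiz and the returned dictionary keys are hypothetical, and the code assumes the last "A." line of each question is the correct answer, as the prompt requests. It does nothing to make the model's formatting consistent.

```python
def parse_quiz(text):
    """Parse 'Q. ...' / 'A. ...' lines into a list of question dicts."""
    questions = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Q."):
            # Start a new question and collect its answers as they appear.
            current = {"question": line[2:].strip(), "answers": []}
            questions.append(current)
        elif line.startswith("A.") and current is not None:
            current["answers"].append(line[2:].strip())
    for q in questions:
        # Assumption from the prompt: the correct answer is always listed last.
        q["correct"] = q["answers"][-1] if q["answers"] else None
    return questions

completion = (
    " Q. What is the name of Poland in Polish?\n"
    "A. Pupcia\nA. Polszka\nA. Póland\nA. Polska\n\n"
)
quiz = parse_quiz(completion)
print(quiz[0]["question"], "->", quiz[0]["correct"])
```

A response in the other layout shown above ("Q1:" followed by a single comma-separated "A1:" line) would yield no questions here, which at least makes format drift easy to detect.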
I have data in a column of university credentials. I essentially need to search through credentials to find doctorate level or equivalent degrees and extract the text block including the degree and university up to a specific character. At this point I have a nicely working formula that is able to do this well for one input degree only.
=LEFT(RIGHT(A1,LEN(A1)-SEARCH("Ph.D",A1)+1),SEARCH(")",RIGHT(A1,LEN(A1)-SEARCH("Ph.D",A1)+1))-1)
So, for example, if I want to find Ph.D. in a text string and extract that block up to the end of the university, it functions as follows:
RawUniversities | ParsedDoctoralLevel using formula
B.Sc. (Hons.)(Western Ontario), Ph.D.(British Columbia), LLM (Regina) | Ph.D.(British Columbia
B.A.(Simon Fraser), M.A.(Victoria), DPsych (British Columbia), MD (Oxford) |
B.Sc.(Trent), M.Sc.(Alberta), D.Phil.(Western) |
B.Sc.(Hons.) (Guelph),Ph.D (Queen's), MD(Toronto) | Ph.D (Queen's
The issue I'm having is I need to search for multiple different doctoral level degrees and then extract the block of text in the same manner with the degree and university. So, for this table example, the formula would need to search for Ph.D, MD, DPsych, and D.Phil, to capture everything. In instances where there are two degrees in the list, taking the leftmost would suffice.
As an additional note, I have tried to solve this using the following formula. In theory it works, but it spills over into multiple columns, which does not work for the parsing I'm doing.
=LEFT(RIGHT(A1,LEN(A1)-SEARCH({"Ph.D","MD","DPsych","D.Phil"},A1)+1),SEARCH(")",RIGHT(A1,LEN(A1)-SEARCH({"Ph.D","MD","DPsych","D.Phil"},A1)+1))-1)
I'm wondering if anyone can help with a streamlined solution to this. I've hit the limits of my capabilities in Excel, and deep dives online have brought up very little.
It might be worth noting that the list of degrees that will need to be searched could be upwards of 50 long. I'm not sure if that influences the best solution for this.
Edits: adding data to the table, having formatting issues.
Additional raw university test data if interested:
B.A.(Winnipeg Bible College), M.A.(London Bible College) M.A., Ph.D.(Toronto)
B.Com.(Hons.), Ph.D.(British Columbia)
B.Sc.(Birmingham), M.A.(Toronto), Ph.D.(London)
B.A. (Hons.)(Queen's), M.A.Sc. (Waterloo), Ph.D. (Queen's)
B.A. (York), M.A. (Western Ontario), Ph.D. (Rice)
BSc PT, (JUST), PT MSc (NYIT), Ph.D. (McGill)
M.F.A. (Concordia), M.A. (GCU Lahore)
Ph.D. (Queen's)
B.Sc. (Queen's), Ph.D.(Toronto)
B.A.(British Columbia), LL.B.(Queen's). LL.M.(California)
B.Sc., MD (Ottawa)
B.A., M.A. (Toronto), Ph.D. (Harvard)
LL.B.(Toronto), M.Sc., LL.M.(Queen's), D.Phil.(Oxford)
B.A.(McGill), J.D. (Queen's)
B.A. (Wilfrid Laurier), M.A. (Queen’s), M.A., Ph.D (Princeton)
M.Phil.(London), MN(Dalhousie), Ph.D.(Texas)
B.A. Hons (Wilfrid Laurier), M.A. (McMaster), Ph.D. (York)
B.Sc., M.Sc. (Isfahan), Ph.D. (Concordia)
B.A.(Keele) M.A., D.Phil.(Oxon)
L.S.M., B.A.(Toronto), J.D. (Queen's), LL.M.(Harvard)
B.A.(R.M.C.) M.A.(McGill) LL.B.(Queen's), LL.M.(London)
B.Sc., M.Sc., Ph.D. (Eӧtvӧs Loránd University, Budapest)
B.Sc., Ph.D. (British Columbia)
B.A., LL.B. (Toronto), S.J.D. (Harvard)
B.A.(Hons.)(Queen's), D.Phil.(Oxon)
B.Ac., MD (Argentina)
BA, MA, Ph.D. (Toronto)
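Outside Excel, the "leftmost matching degree, up to the closing bracket" logic can be sketched in Python. This is an illustration rather than a formula replacement: the name extract_doctoral and the DEGREES list are placeholders, the match is case-sensitive (Excel's SEARCH is not), and, like the original formula, the result stops just before the first ")" that follows the degree.

```python
DEGREES = ["Ph.D", "MD", "DPsych", "D.Phil"]  # could grow to 50+ entries

def extract_doctoral(text, degrees=DEGREES):
    """Return the leftmost degree block, without its closing ')', or None."""
    # Position of each degree that occurs in the text (-1 means not found).
    starts = [i for i in (text.find(d) for d in degrees) if i != -1]
    if not starts:
        return None
    start = min(starts)            # leftmost degree wins
    end = text.find(")", start)    # first ')' after the degree
    return text[start:] if end == -1 else text[start:end]

rows = [
    "B.Sc. (Hons.)(Western Ontario), Ph.D.(British Columbia), LLM (Regina)",
    "B.A.(Simon Fraser), M.A.(Victoria), DPsych (British Columbia), MD (Oxford)",
    "B.Sc.(Trent), M.Sc.(Alberta), D.Phil.(Western)",
    "B.A.(McGill), J.D. (Queen's)",
]
for row in rows:
    print(extract_doctoral(row))
```

Plain substring matching means a short entry such as "MD" would also match inside a longer token like "DMD", so a longer degree list may need stricter matching.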
I need a little help renaming some folders. The problem is that I need to move part of each folder name toward the beginning, as in this example:
Original:
1982-11-03 - Seibu Stadium, Saitama, Japan [FLAC]Space Boogie [version 2][Wardour][AUD][109.56][560.72MB]
Required:
1982-11-03 - Space Boogie - Seibu Stadium, Saitama, Japan [FLAC][version 2][Wardour][AUD][109.56][560.72MB]
So I need to move the phrase (for example "Space Boogie") to just after the date, as in the example above.
I know this is difficult. I've tried to do it using sed and other tools, with no good results.
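One way to approach the rename is to compute the new name with a regular expression first and only then rename anything. The sketch below covers just the name transformation and assumes every folder follows the example's layout: a date, " - ", the venue, one bracketed tag such as [FLAC], the title, then the remaining bracketed tags. The function name move_title is made up for illustration.

```python
import re

# date - venue [first tag]title [remaining tags...]
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) - "   # 1: date
    r"([^\[]+?)\s*"              # 2: venue (everything before the first '[')
    r"(\[[^\]]*\])"              # 3: first bracketed tag, e.g. [FLAC]
    r"([^\[\]]+?)\s*"            # 4: title sitting between the tags
    r"(\[.*)$"                   # 5: the remaining tags
)

def move_title(name):
    """Return the name with the title moved to just after the date."""
    m = PATTERN.match(name)
    if m is None:
        return name  # leave names that do not fit the layout untouched
    date, venue, first_tag, title, rest = m.groups()
    return f"{date} - {title} - {venue} {first_tag}{rest}"

old = ("1982-11-03 - Seibu Stadium, Saitama, Japan [FLAC]Space Boogie "
       "[version 2][Wardour][AUD][109.56][560.72MB]")
print(move_title(old))
```

The result could then be passed to os.rename for each directory, ideally after printing the old and new names once as a dry run.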
I have a list of approximately 4,000-odd ancient Chinese proverbs I would like to import into Pleco (a Chinese dictionary app) for flashcards. However, Pleco needs them in a specific format (a table separated by tabs) and to do so manually would take forever.
Any idea how I would implement a script to automatically format the list?
e.g.
Ài fàn yǒu fàn; xī yī yǒu yī.
爱饭有饭, 惜衣有衣。
愛飯有飯, 惜衣有衣。
[Those who] treasure [their] food [will always] have food [and those who] take care of [their] clothing [will always] have clothes [to wear].
[An admonition to thrift; see also bùyī nuǎn below.]
CLOTHING FOOD THRIFT
[A2]
Ái gǒu yǎo de rén bù dōu shì zéi.
挨狗咬的人不都是贼。
挨狗咬的人不都是賊。
(lit) Not all who are bitten by dogs are thieves.
(fig) One should not make judgments based on superficial appearances. Things are not always as they (first) appear.
APPEARANCES JUDGMENTS
[A3]
Áiguo shé yǎo, jiàn shàn pǎo.
挨过蛇咬, 见鳝跑。
挨過蛇咬, 見鱔跑。
(lit) [One who has been] bitten by a snake [at the] sight [of an] eel [will] run away.
(fig) “Once bitten, twice shy.”
[See also yīzhāo bèi shé yǎo below.]
EXPERIENCE LEARNING
into:
爱饭有饭, 惜衣有衣 愛飯有飯, 惜衣有衣 [Those who] treasure [their] food [will always] have food [and those who] take care of [their] clothing [will always] have clothes [to wear].
[An admonition to thrift; see also bùyī nuǎn below.]
挨狗咬的人不都是贼。 挨狗咬的人不都是賊。 (lit) Not all who are bitten by dogs are thieves.
(fig) One should not make judgments based on superficial appearances. Things are not always as they (first) appear.
挨过蛇咬, 见鳝跑。 挨過蛇咬, 見鱔跑。 (lit) [One who has been] bitten by a snake [at the] sight [of an] eel [will] run away.
(fig) “Once bitten, twice shy.”
[See also yīzhāo bèi shé yǎo below.]
It needs to be in the form: Simplified characters TAB Traditional characters TAB TAB Definition
Please leave out the Pinyin (e.g. Ài fàn yǒu fàn; xī yī yǒu yī.), the identifier (e.g. [A1], [A2]) and the last line with the topics (e.g. CLOTHING FOOD THRIFT)
Further down the list the entries are not always uniform. Sometimes there are more or fewer lines per proverb.
Thank you all so much! StackOverflow has been a huge help to me in my coding adventures.
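A rough Python sketch of such a script follows. It rests on several assumptions that are not stated in the question: identifier lines look like [A2], the first two lines containing Chinese characters in each entry are the simplified and traditional forms, the topics line consists only of capital letters and spaces, and multi-line definitions are joined with a space. The function name to_pleco_rows is hypothetical.

```python
import re

ID_LINE = re.compile(r"^\[[A-Z]+\d+\]$")   # e.g. [A2]
CJK = re.compile(r"[\u4e00-\u9fff]")       # any common Chinese character
TOPICS = re.compile(r"^[A-Z][A-Z ]*$")     # e.g. CLOTHING FOOD THRIFT

def to_pleco_rows(text):
    """Return 'simplified TAB traditional TAB TAB definition' rows."""
    entries, current = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if ID_LINE.match(line):            # identifier starts a new entry
            if current:
                entries.append(current)
            current = []
        else:
            current.append(line)
    if current:
        entries.append(current)

    rows = []
    for lines in entries:
        cjk_idx = [i for i, l in enumerate(lines) if CJK.search(l)]
        if len(cjk_idx) < 2:
            continue                       # skip entries that do not fit
        simplified, traditional = lines[cjk_idx[0]], lines[cjk_idx[1]]
        # Everything after the traditional line, minus the topics line.
        definition = [l for l in lines[cjk_idx[1] + 1:] if not TOPICS.match(l)]
        rows.append(f"{simplified}\t{traditional}\t\t{' '.join(definition)}")
    return rows
```

The pinyin line is dropped implicitly because it comes before the first line with Chinese characters. Writing "\n".join(to_pleco_rows(text)) to a UTF-8 file would give one proverb per line.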
I have a CSV file of song lyrics that I took from Genius. Right now, I'm preparing my data. I have two columns, "songs" and "artist". The "songs" column holds a lot of information: title, album, year, lyrics and URL. I need to separate the "songs" column into 5 columns.
I tried to split the data by comma like this:
df = pd.read_csv('output.csv', header=None)
df = pd.DataFrame(df[0].str.split(',').tolist())
But with this code I got 122 columns, because every comma in the lyrics created another column.
I guess I have to keep all my lyrics inside double quotes, so that if I split by comma the full lyrics remain in one single column.
Does anyone know how I can do that?
Adding 1 sample of the data:
songs,artist
"{'title': 'Berzerk', 'album': 'The Marshall Mathers LP 2', 'year': '2013-08-27', 'lyrics': '[Verse 1]\nNow this shit\'s about to kick off, this party looks wack\nLet\'s take it back to straight hip-hop and start it from scratch\nI\'m \'bout to bloody this track up, everybody get back\nThat\'s why my pen needs a pad, \'cause my rhymes on the rag\nJust like I did with addiction, I\'m \'bout to kick it\nLike a magician, critics I turn to crickets\nGot \'em still on the fence whether to picket\nBut quick to get it impaled when I tell \'em, ""Stick it!""\nSo sick I\'m looking pale, wait, that\'s my pigment\n\'Bout to go ham, ya bish, shout out to Kendrick\nLet\'s bring it back to that vintage Slim, bitch!\nThe art of MCing mixed with da Vinci and MC Ren\nAnd I don\'t mean Stimpy\'s friend, bitch\nBeen Public Enemy since you thought PE was gym, bitch\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 1]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo, baby, make just like K-Fed\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berserk, grab your vial, yeah\n\n[Verse 2]\nGuess it\'s just the way that I\'m dressed, ain\'t it?\nKhakis pressed, Nike shoes crispy and fresh laced\nSo I guess it ain\'t that aftershave\nOr cologne that made \'em just faint\nPlus I showed up with a coat fresher than wet paint\nSo if love is a chess game, check mate\nBut girl, your body\'s bangin\', jump me in, dang, bang-bang\nYes siree \'Bob\', I was thinking the same thang\nSo come get on this Kid\'s rock, baw with da baw, dang-dang\nPow-p-p-p-pow, chica, pow, chica, wow-wow\nGot your gal blowin\' up a valve, valve-valve\nAin\'t slowin\' down, throw in the towel, towel-towel\nDumb it down, I don\'t know how, 
huh-huh, how-how\nAt least I know that I don\'t know\nQuestion is, are you bozos smart enough to feel stupid?\nHope so, now ho…\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 2]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo crank the bass up like crazy\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berzerk, get your vinyls!\n\n[Scratch]\n\n[Verse 3]\nThey say that love is powerful as cough syrup in styrofoam\nAll I know is I fell asleep and woke up in that Monte Carlo\nWith the ugly Kardashian, Lamar, oh\nSorry yo, we done both set the bar low\nFar as hard drugs are though, that\'s the past\nBut I done did enough codeine to knock Future into tomorrow\nAnd girl, I ain\'t got no money to borrow\nBut I am tryin\' to find a way to get you alone: car note\nOh, Marshall Mathers\nShithead with a potty mouth, get the bar of soap lathered\nKangol\'s and Carheartless Cargos\nGirl, you\'re fixin\' to get your heart broke\nDon\'t be absurd, ma\'am, you birdbrain, baby\nI ain\'t called anybody baby since Birdman, unless you\'re a swallow\nWord, Rick? 
(Word, man, you heard)\nBut don\'t get discouraged, girl\nThis is your jam, unless you got toe jam\n\n[Pre-Chorus]\nKick your shoes off, let your hair down\n(And go berserk) all night long\nGrow your beard out, just weird out\n(And go berserk) all night long\n\n[Chorus 1]\nWe\'re gonna rock this house until we knock it down\nSo turn the volume loud\n\'Cause it\'s mayhem \'til the A.M.\nSo, baby, make just like K-Fed\nAnd let yourself go, let yourself go\nSay ""Fuck it!"" before we kick the bucket\nLife\'s too short to not go for broke\nSo everybody, everybody, go berserk, grab your vial, yeah', 'image': 'https://images.genius.com/a47bb228d28fd8a0e6e73abfabef7832.1000x1000x1.jpg'}",Eminem
Try this.
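import ast
import pandas as pd

# Let pandas handle the CSV quoting, so commas inside the lyrics stay in one field.
raw = pd.read_csv("output.csv")
# Each "songs" cell is a Python dict literal; parse it safely into a real dict.
raw["songs"] = raw["songs"].apply(ast.literal_eval)
# Expand each dict into one column per key.
songs = raw["songs"].apply(pd.Series)
# Put the artist column back alongside the expanded song columns.
result = pd.concat([raw[["artist"]], songs], axis=1)
result.head()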
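To see why this works where splitting on commas does not, here is a small standard-library sketch on a shortened, made-up row in the same shape as the sample (the lyrics text is abbreviated and the variable names are only for illustration). The CSV parser respects the double quotes around the "songs" field, and ast.literal_eval then turns the dict literal inside it into a real dict.

```python
import ast
import csv
import io

# A shortened stand-in for one row of output.csv (hypothetical content).
sample = '''songs,artist
"{'title': 'Berzerk', 'year': '2013-08-27', 'lyrics': 'Kick off, this party\\nSay ""Stick it!""'}",Eminem
'''

# Naive splitting breaks the row at every comma, including those in the lyrics.
naive_pieces = sample.splitlines()[1].split(",")

# A CSV parser keeps the quoted field whole; literal_eval parses the dict text.
records = []
for row in csv.DictReader(io.StringIO(sample)):
    song = ast.literal_eval(row["songs"])
    records.append({**song, "artist": row["artist"]})

print(len(naive_pieces), "pieces from a naive split")
print(records[0]["title"], "by", records[0]["artist"])
```

Each record ends up with one key per field of the original dict plus the artist, which is the same shape the pandas version produces as columns.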