Splitting txt file into multiple based on custom delimiter - python-3.x

I have a txt file with around 100 multiple-choice questions. I want to split it into 100 txt files, each containing one question. The delimiter is "number." (for example, 1. for the first question, 2. for the next, and so on). The issue is that a "number." can also appear in the middle of a question, so a real delimiter can be identified by checking that "(d)" was encountered before that "number.".
Sample text -
1.Consider the following statements with regard to the State Council of Ministers:1. The constitution specifies the size of the state council of ministers and the ranking of ministers.2.The advice tendered by the ministers to the Governor shall not be inquired into in any court. Which of the statements given above is/ are correct? (a)1 only(b)2 only(c)Both 1 and 2(d)Neither 1 nor 22.Consider the following statements with reference to Public Accounts Committee:1. The committee was set up under the provisions of the Government of India Act1919.2.Speaker is the ex- Officio Chairman of the committee. A minister cannot be a member of the committee. Which of the statements given above is/are not correct? (a)1 only(b)2 and 3 only(c)1 and 2 only(d)2 only
The text looks like the sample above, and it should be split like this -
1.Consider the following statements with regard to the State Council of Ministers:1. The constitution specifies the size of the state council of ministers and the ranking of ministers.2.The advice tendered by the ministers to the Governor shall not be inquired into in any court. Which of the statements given above is/ are correct? (a)1 only(b)2 only(c)Both 1 and 2(d)Neither 1 nor 2
2.Consider the following statements with reference to Public Accounts Committee:1. The committee was set up under the provisions of the Government of India Act1919.2.Speaker is the ex- Officio Chairman of the committee. A minister cannot be a member of the committee. Which of the statements given above is/are not correct? (a)1 only(b)2 and 3 only(c)1 and 2 only(d)2 only
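One way to approach this, sketched below under stated assumptions: questions are numbered sequentially starting at 1, every question contains a "(d)" option, and question n ends where the text "n+1." first appears after that "(d)". Searching for the expected next number (rather than any number) copes with fused digits such as "nor 22.Consider". The function names split_questions and write_questions and the output file names are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def split_questions(text):
    """Split concatenated questions at "<n+1>." markers that follow a "(d)" option."""
    questions = []
    start = 0   # index where the current question begins
    number = 1  # number of the current question
    while True:
        # The current question is only complete once its "(d)" option has appeared.
        d_pos = text.find("(d)", start)
        if d_pos == -1:
            break
        # Look for the next expected question number after that "(d)".
        next_start = text.find(f"{number + 1}.", d_pos)
        if next_start == -1:
            break
        questions.append(text[start:next_start].strip())
        start = next_start
        number += 1
    # Whatever remains is the last question.
    tail = text[start:].strip()
    if tail:
        questions.append(tail)
    return questions


def write_questions(questions, out_dir):
    """Write each question to its own file: question_1.txt, question_2.txt, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, question in enumerate(questions, start=1):
        (out / f"question_{i}.txt").write_text(question, encoding="utf-8")


# Shortened stand-in for the sample text, with the same fused "22." boundary.
sample = (
    "1.Which are correct?1. First statement.2.Second statement. "
    "(a)1 only(b)2 only(c)Both 1 and 2(d)Neither 1 nor 2"
    "2.Which are not correct?1. Another statement.2.One more. "
    "(a)1 only(b)2 and 3 only(c)1 and 2 only(d)2 only"
)
parts = split_questions(sample)
```

Calling write_questions(parts, "questions") would then create one file per question. A known limitation of this sketch: if the text of option (d) itself contains the next question number followed by a period, the split happens too early.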


How to use Qualified Number Restrictions in Protege for OWL

I have the following two English sentences
(1) Each student has to follow some courses and, specifically,
undergraduate students can follow only undergraduate courses but postgraduate
students can follow postgraduate courses and up to two undergraduate courses.
(2) Each student can have only a single mark for each course.
I have created an ontology in the Protege tool, but I don't know how to express the qualified restrictions "up to two" or "exactly one" (a single mark).
What I have done so far:
First I created four classes "Undergraduate Courses", "Undergraduate Student", "Postgraduate Courses", "Postgraduate Student" and an ObjectProperty named followCourse. Based on the first (1) English sentence, undergraduate students can follow only undergraduate courses. However, postgraduate students can follow postgraduate courses and also some undergraduate courses. Thus, I have written the following for the Postgraduate Student class:
followCourse some Postgraduate_Courses or Undergraduate_Courses.
I am not sure that this satisfies the threshold of up to two (≤2) courses, because some in description logic means at least one (screenshot 1).
For the second sentence about "single mark" I have added the following in Undergraduate_courses and Postgraduate_courses classes:
hasMark max 1 where hasMark is a DataProperty for Curriculum class (screenshot 2).
I believe hasMark max 1 is wrong in this case, because that expression effectively says a course can have only one mark overall, so the same course cannot carry two marks for two different students. That is wrong, because two students could take the same course and receive separate marks.
(screenshot 1)
(screenshot 2)
Sentence 1
As for sentence 1, you are correct regarding using max 2 Undergraduate_Courses.
However, watch out for
isFollowedBy some Undergraduate_Student or Postgraduate_Student.
As you have it, it means
(isFollowedBy some Undergraduate_Student) or (Postgraduate_Student)
but what you really want is
isFollowedBy some (Undergraduate_Student or Postgraduate_Student).
Also, for Undergraduate_Student you specify followCourse only Undergraduate_Course which will correctly ensure that undergraduate students can only take undergraduate courses. However, as is, it will allow for undergraduate students that take zero undergraduate courses. Thus, to force undergraduate students to take at least 1 undergraduate course you have to change the statement as follows:
followCourse only Undergraduate_Course and followCourse some Undergraduate_Course.
Sentence 2
Here it is probably best to introduce a new concept, say StudentCourseMark, which is used to link a single student, a single course and a single mark. Assuming you have the object properties hasStudent and hasCourse and the data property hasMark, you can define StudentCourseMark as a subclass of
hasStudent exactly 1 Student and
hasCourse exactly 1 Course and
hasMark exactly 1 xsd:integer
However, this can still allow for the possibility that you can have 2 different individuals of StudentCourseMark that have the same student and course, but different marks. To avoid this you can add a key on student and course for StudentCourseMark:
HasKey:
hasCourse,
hasStudent

Is it possible to extract the "Description" of an ecoinvent activity?

I'm trying to automate the extraction of the "Description" of activities from ecoinvent database (e.g., "wood preservation facility construction, dipping/immersion tank | wood preservation facility, dipping/immersion tank"), using brightway2.
As there seems to be no attribute called "description" in the activity object, I tried "comment" instead. I was expecting the following content from the activity named "wood preservation facility construction, dipping/immersion tank | wood preservation facility, dipping/immersion tank":
"The dipping or immersion tank consists of the main steel tank containing the preservative and an second, larger steel tank as the retention tank, within which the first tank is located. As an alternative design the main tank can be placed in a sealed concrete construction, serving as the retention tank. Appart from the tanks, a lifting device (for lifting the wood into and out of the tank) as well as a hold-down device (to ensure complete immersion) are inventoried.;The length of the inventoried tank is 12 m, corresponding to a medium sized tank, The batch size is between 12 - 16 m3 of wood. Assuming an average input of 9 l of preservative/m3 , a yearly throughput of 6000 m3 of wood and a service life of 75 years, about 4.05E06 kg of wood preservative are applied over the service life of the tank.;;Data on land use and on the building hall are taken from pressure vessel processes as no specific data on these infrastructure inputs could be gathered. These values refer to a yearly throughput of 6000 m3 of wood.; Infrastructure dataset containing the main elements of the facility. The land area and the material (production and disposal) needed for a preservative treatment plant for dipping or immersion treatment including strorage area is included. 60 - 90 kg of 2-component laquers are not inventoried." The content is presented in the "Description" field under "General Information" section, if you open the activity in OpenLCA software.
But what was returned was the following:
"Data from Germany and Switzerland. Location of plant within Europe of minor importance" (which is the content of another "Description" field under "Geography" section, if you open the activity in OpenLCA software).
Is there any way I can retrieve the correct Description content of an activity using bw2? Thanks

Real inventory vs Finance inventory

I'm pretty new to Maximo and I have a question about how things should be handled. We have a project to integrate our ERP with Maximo using MIF interfaces. I'd like to know how you handled this in your company, as I'm sure we're not the only ones facing these challenges.
Let's keep it simple and start with inventory. Receiving will be performed in our ERP and sent to Maximo. "Consumption" and inventory movements will be performed in Maximo and interfaced back to our ERP. So far so good, standard process.
However, there's a difference between the real inventory and the inventory from a financial perspective. As an example, let's say we have 3 cutters (same part number, different serial numbers). We have 2 in internal storage and 1 in the machine where it's being used. Once it is in the machine, finance treats it as "consumed". So from a finance point of view, there are only 2 in stock. From the tooling guy's point of view, however, there are 3! The one in the machine can be taken out and replaced by one of the other two.
How are things like that handled in Maximo? Any help or advice would be appreciated.
Regards
M.
It depends on whether you plan to track at the individual level (cutters tracked by individual asset tags or serial numbers) or at the group level (3 cutters with the same item number).
If the cutter is treated as inventory and not as an asset, once it is issued then it is considered used. You can opt to use "Condition Code" and give each item a value from 0 to 100. Think of this value as depreciation. If a cutter is swapped out, then the new cutter (100%) is issued, and the replaced cutter is disposed of or returned to the shelf with a diminished value (e.g. 25%).

How to break line into known words

I need to break a line of text into different columns in Excel. Here is the input that I get.
Input:
37006 II Semester P.G. Diploma in Clinical Research and Clinical Data Management Examination, July/August 2012 Pharma Regulatory Affairs Time : 3 Hours Max. Marks : 100
Output: CSV record with structure (Code, Sem/Year, Subject, Course, Exam Date, Time, Marks)
37006 , II Semester, P.G. Diploma in Clinical Research and Clinical Data Management, Pharma Regulatory Affairs, July/August 2012 , 3 Hours , 100
I have data in different sets which constructs above lines. For example:
Grammar (this is an array / dictionary):
Semesters[I,II,III,IV,V,VI,VII,VIII,IX,X,1,2,3,4,5,6,7,8,9,10]
Years[I,II,III,IV,V,VI,VII,VIII,IX,X,1,2,3,4,5,6,7,8,9,10]
Subjects[P.G. Diploma in Clinical Research and Clinical Data Management, LL.B]
Courses[Pharma Regulatory Affairs,Law - Jurisprudence]
ExamDates[ July/August 2012 , Jan./Feb. 2013 ]
Time[3 Hours]
MaxMarks[30,40,50,60,70,80,90,100]
FYI,
I'm not sure I can use any delimiters to break it up, as the text is highly unpredictable.
I'm not sure the text will be in the same order in each line, and there is no fixed length in characters or words.
My assumption is to read word by word and try to match each word against the arrays that I have. If it matches, categorize that word accordingly and add it to the relevant column in Excel.
I know how to handle the data and everything else, except for the optimized / best way to
understand which category each word falls under.
Is there any lexical analysis expert that can share some thoughts on this?
You should use regular expressions to match such complicated text patterns.
Please take a look at a lexical analyzer like ANTLR. If you know Java or other languages that read regular expressions, you will be able to parse these with ease after an afternoon (or week) of torture. You can also write the regexp in Java, but I would nudge you toward the ANTLR interface, which you may use from Eclipse. It will show you how the lines are being parsed.
Have the ANTLR or Java program write its output to a CSV file. The CSV will be your vehicle for getting the data into the Excel spreadsheet.
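For comparison, here is a lighter-weight sketch in Python rather than ANTLR or Java (a swapped-in approach, not the method described above). It combines the known phrase lists from the question with a few regular expressions. The labels "Time :" and "Max. Marks :" and the "Semester"/"Year" keywords are assumptions inferred from the single sample line, and the names GRAMMAR, PATTERNS, FIELDS and parse_line are hypothetical.

```python
import csv
import io
import re

# Known multi-word phrases, taken from the lists in the question.
GRAMMAR = {
    "Subject": ["P.G. Diploma in Clinical Research and Clinical Data Management", "LL.B"],
    "Course": ["Pharma Regulatory Affairs", "Law - Jurisprudence"],
    "Exam Date": ["July/August 2012", "Jan./Feb. 2013"],
}

# Fields recognised by shape rather than by lookup (assumed from the sample line).
PATTERNS = {
    "Code": r"^\s*(\d+)",
    "Sem/Year": r"\b((?:[IVX]+|\d+)\s+(?:Semester|Year))\b",
    "Time": r"Time\s*:\s*(\d+\s+Hours?)",
    "Marks": r"Max\.\s*Marks\s*:\s*(\d+)",
}

FIELDS = ["Code", "Sem/Year", "Subject", "Course", "Exam Date", "Time", "Marks"]


def parse_line(line):
    """Return the field values of one input line in FIELDS order ('' if not found)."""
    record = {}
    for field, phrases in GRAMMAR.items():
        # Try longer phrases first so a short phrase cannot shadow a longer one.
        ordered = sorted(phrases, key=len, reverse=True)
        record[field] = next((p for p in ordered if p in line), "")
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, line)
        record[field] = match.group(1) if match else ""
    return [record[field] for field in FIELDS]


line = (
    "37006 II Semester P.G. Diploma in Clinical Research and Clinical Data "
    "Management Examination, July/August 2012 Pharma Regulatory Affairs "
    "Time : 3 Hours Max. Marks : 100"
)
row = parse_line(line)

# Write the row as CSV text; a real script would write to a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
```

Because each field is located independently, the order of the fields within a line does not matter. Plain substring lookup can misfire on short phrases such as "LL.B" if they occur inside unrelated text, so real data may need stricter matching.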

Information extraction. Counting mentions to measure relevance

Is it possible to count how many times an entity has been mentioned in an article? For example
ABC Company is one of the largest car manufacturers in the
world. It is also the largest
company in terms of annual production.
It is also the second largest exporter of luxury cars, after XYZ
company. Both ABC and XYZ
together produces over n% of total car
production in the country.
mentions ABC company 4 times.
Yes, this is possible. It's a combination of
named-entity recognition (NER), which for English is practically a solved problem, and
coreference resolution, which is the subject of ongoing research (but give this package a try)
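To illustrate why both steps are needed, here is a naive baseline in Python that only counts surface strings from a hand-supplied alias list (count_mentions and the alias list are hypothetical, and no NER or coreference library is involved). On the example text it finds 2 mentions of ABC, not 4, because the two "It" references can only be attributed to ABC Company by coreference resolution.

```python
import re


def count_mentions(text, aliases):
    """Count case-insensitive whole-word occurrences of any alias in the text."""
    # Longest alias first, so "ABC Company" is counted once rather than as "ABC" + rest.
    ordered = sorted(aliases, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(a) for a in ordered) + r")\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


article = (
    "ABC Company is one of the largest car manufacturers in the world. "
    "It is also the largest company in terms of annual production. "
    "It is also the second largest exporter of luxury cars, after XYZ company. "
    "Both ABC and XYZ together produces over n% of total car production in the country."
)

abc_count = count_mentions(article, ["ABC Company", "ABC"])  # 2: pronouns are missed
xyz_count = count_mentions(article, ["XYZ company", "XYZ"])  # 2
```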
