How to create tabular output in Python

Currently, I'm looking to scrape the signatures table from the EDGAR filings for specific companies. I have created a Python program that goes into each document and finds the tables that I need to scrape. I'm having trouble figuring out how to output the data to a file in a 'pretty' way.
Here's a link for a bit of a visual (just scroll to the bottom of the document, there will be a page of signatures there):
Example Document
What I'm looking to do is format the table, the same way it is formatted on the website, with each cell taking up a specific amount of space, and filling in unused space with... well, spaces!
My current output:
|Signature, Date, Title|
|/s/ Stanley M. Kuriyama| Chairman of the Board, February 29th, 2016|
|Stanley M. Kuriyama|
|/s/ Christopher J. Benjamin, President, Chief Executive, February 29th, 2016|
|Christopher J. Benjamin, Officer and Director (and so on...)|
|-----------------------------------------------------------------------------|
What I'm looking to do (periods are spaces):
|Signature......................,Title......................,Date...............|
|/s/ Stanley M. Kuriyama,.......,Chairman of the Board,.....,February 29th, 2016|
|Stanley M. Kuriyama............................................................|
|/s/ Christopher J. Benjamin....,President, Chief Executive,February 29th,2016..|
|Christopher J. Benjamin,.......,Officer and Director (and so on...)............|
|-------------------------------------------------------------------------------|
Is there any way to print out the string plus (maxSize - stringSize) spaces per cell, so the data looks more tabular? I'm looking to do this with vanilla Python 3 and no additional downloads, because the people using this program may not be as tech savvy as I am.
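One way to get this with only the standard library is str.ljust, which pads a string with spaces up to a given width. The sketch below is only an illustration: the function name format_table and the " | " separator are made up for this example, and it assumes each row has already been split into a list of cell strings.

```python
def format_table(rows, sep=" | "):
    """Pad every cell to the width of the widest cell in its column."""
    n_cols = max(len(row) for row in rows)
    # Fill short rows with empty cells so every row has the same length.
    padded = [list(row) + [""] * (n_cols - len(row)) for row in rows]
    # The width of a column is the length of its longest cell.
    widths = [max(len(row[i]) for row in padded) for i in range(n_cols)]
    lines = []
    for row in padded:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("|" + sep.join(cells) + "|")
    return "\n".join(lines)


rows = [
    ["Signature", "Title", "Date"],
    ["/s/ Stanley M. Kuriyama", "Chairman of the Board", "February 29th, 2016"],
    ["Stanley M. Kuriyama"],
    ["/s/ Christopher J. Benjamin", "President, Chief Executive", "February 29th, 2016"],
    ["Christopher J. Benjamin", "Officer and Director"],
]
table = format_table(rows)
print(table)
```

Because every cell in a column is padded to the same width, all output lines end up the same length. The same string can be written to a file with an ordinary open(...).write(table).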

Related

Extract dates in various formats from free text string in excel

I'm struggling with extracting dates from a free text field in Excel where the date could be in any number of formats due to human input.
Some examples of the entries are below but essentially they could be 30/6, 30/06, 30th June, 30/6/21, 30/06/21, 30/06/2021, 30-6, 30th, 30-06, tomorrow!
Alright; I probably can't do much about that last one and I can pull the date from the others if I know the format but I'm looking for something that would handle any and all of the permutations.
Example Data
Column A
Empire still haven't repaired the roof. Waiting on Empire to waterproof the roof then I can patch up the ceiling in 611. ETA 30/06/2021
ETA 11/6/21
floor boards need replaced
22/6/21 awaiting parts
have sanded the filler that was in the war and refilled ready for or sanding again
new bathroom floor to replace. need to source materials and have time to do job. full bathroom sub floor to replace. ETA 29/06
Engineer attended and found E122 error, reset and tested but fault still present. New EVA probe required. Part to be sent to site.
new toilet handle ordered. ETA 28th June
TV on order
refer to Colin Warner for further works and required parts
28/6 awaiting underlay
leak in room 315 fixed. room 215 will need a week to dry. ETA 30th June.
10 July
ceiling too wet to carry out any work. eta 5-07-2021
I know it's a big ask but if you know of a formula or VBA UDF that can handle all of those then I'd be eternally grateful.
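For comparison, here is how the same idea might look outside Excel. This is a rough Python sketch, not a formula or VBA UDF: the function name extract_date and the default_year parameter are made up for illustration, it assumes day-first dates, and it does not handle a bare "30th" or "tomorrow". It can also be fooled by things like "2-4" that merely look like a date.

```python
import re
from datetime import date

MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]

# Day-first numeric dates: 30/6, 30-06, 30/6/21, 30/06/2021
NUMERIC = re.compile(r"\b(\d{1,2})[/-](\d{1,2})(?:[/-](\d{2}|\d{4}))?\b")
# Day followed by a month name: 30th June, 10 July
TEXTUAL = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(" + "|".join(MONTHS) + r")\b",
    re.IGNORECASE,
)


def extract_date(text, default_year):
    """Return the first date found in text, or None if nothing matches."""
    match = NUMERIC.search(text)
    if match:
        day, month = int(match.group(1)), int(match.group(2))
        if match.group(3) is None:
            year = default_year          # assumption: missing year means this one
        else:
            year = int(match.group(3))
            if year < 100:
                year += 2000             # assumption: two-digit years are 20xx
    else:
        match = TEXTUAL.search(text)
        if match is None:
            return None
        day = int(match.group(1))
        month = MONTHS.index(match.group(2).lower()) + 1
        year = default_year
    try:
        return date(year, month, day)
    except ValueError:
        return None                      # e.g. 31/02 is not a real date


print(extract_date("ETA 11/6/21", 2021))
print(extract_date("new toilet handle ordered. ETA 28th June", 2021))
print(extract_date("TV on order", 2021))
```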

Extract dates from text with spaCy in relation to a given date

I want to extract dates, given in text form like 'next week' or 'February' from a news article, given the date the article was published. I.e. if the article was published on Feb 13 2019 and 'next week' was mentioned in that article, I want the function to find Feb 20 2019 for 'next week'. Does anybody know how to do that? I was thinking of doing it with spaCy's entity finder and then manually writing a function for every 'DATE' instance, but there must be something better.
Here is my example:
text = """Chancellor Angela Merkel and some of her ministers will
discuss at a cabinet retreat next week ways to avert driving
bans in major cities after Germany's top administrative court
in February allowed local authorities to bar heavily polluting
diesel cars."""
article_date = '2019-02-13'
My ideal result would be something like the following:
ref_dates = {'next_week': '2019-02-20',
'february': '2019-02-01'}
With SUTime from CoreNLP this can be done quite easily:
https://github.com/FraBle/python-sutime
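If installing SUTime is not an option, the "write a function per expression" fallback from the question could look roughly like the sketch below. It is not SUTime and does not use spaCy; it only covers "next week" and capitalized month names, and the function name resolve_dates and the rule that a month maps to its first day in the article's year are assumptions taken from the ideal result above.

```python
import re
from datetime import date, timedelta

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def resolve_dates(text, article_date):
    """Map a few relative date expressions to ISO dates (hypothetical helper)."""
    ref = date.fromisoformat(article_date)
    found = {}
    if re.search(r"\bnext\s+week\b", text, re.IGNORECASE):
        found["next_week"] = (ref + timedelta(weeks=1)).isoformat()
    for number, name in enumerate(MONTHS, start=1):
        # Match capitalized names only, so the verb "may" is not picked up.
        if re.search(r"\b" + name + r"\b", text):
            found[name.lower()] = date(ref.year, number, 1).isoformat()
    return found


text = """Chancellor Angela Merkel and some of her ministers will
discuss at a cabinet retreat next week ways to avert driving
bans in major cities after Germany's top administrative court
in February allowed local authorities to bar heavily polluting
diesel cars."""
ref_dates = resolve_dates(text, "2019-02-13")
print(ref_dates)
# {'next_week': '2019-02-20', 'february': '2019-02-01'}
```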

Using the Rank Function In Excel

Sorry if this has been answered and I feel it may have but I am struggling to find an answer that helps me to the point of success.
I have a basic spreadsheet for time trial results. The spreadsheet is for both men and women. Basically, points are awarded for the quickest times across all competitors in 30-second intervals, which is fine (Column N); I have managed this.
My question is: on top of this, the top 7 men in ranked position are awarded additional bonus points, and the top 3 women (only 3 because there are normally fewer women attending the events than men) are also awarded additional bonus points.
I have set up a column to specify M or F (Column C) when a competitor is added, and also using RANK
=IF(G7=0,0,RANK(G7,$G$6:$G$36,1)-COUNTIF($G$6:$G$36,0))
on the times - Column K
But I am really struggling with how to use a formula to extract the top 7 men and top 3 women and award the points, i.e. there will be a 1st to 7th place man but also a 1st to 3rd place woman. So in essence, is there any way I can extract the two sets of rankings using the F and M identifiers in the appropriate column?
At the moment I can only get a basic ranking, and using an IF(AND) statement I can return results to apply the bonus points if the conditions are matched, but this doesn't help with identifying the rankings according to Male (1st-7th) or Female (1st-3rd).
You can also see on my screen dump that, although I haven't added the formula for assigning the female points, because of the conditions being met I don't have bonus points awarded for 5th place because I set sex to F, which I was hoping someone could also help me with.
Sorry for waffling but I have been toiling with this for 3 days now and I am just going in circles.
Really appreciate any reply
Just use COUNTIFS:
=IF(G6=0,0,COUNTIFS(C:C,C6,G:G,"<" & G6,G:G,"<>0")+1)
This ranks each entry only against the rows that share its value in column C, so you get two 1st places: one male and one female.
To add for the Club just add another condition:
=IF(G6=0,0,COUNTIFS(C:C,C6,G:G,"<" & G6,G:G,"<>0",B:B,B6)+1)
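To see what the first COUNTIFS formula is counting, here is the same logic written out as a small Python sketch. The function name and the sample times are invented for illustration; the rule itself mirrors the formula: a time of 0 gets rank 0, otherwise the rank is one plus the number of non-zero times in the same group that are faster.

```python
def rank_within_group(entries):
    """entries is a list of (group, time) pairs; a time of 0 means no result."""
    ranks = []
    for group, time in entries:
        if time == 0:
            ranks.append(0)  # same as the IF(G6=0,0,...) guard
            continue
        # Same as COUNTIFS(C:C,C6,G:G,"<"&G6,G:G,"<>0")
        faster = sum(
            1 for g, t in entries
            if g == group and t != 0 and t < time
        )
        ranks.append(faster + 1)
    return ranks


# Hypothetical data: (sex, time in seconds)
entries = [("M", 1500), ("F", 1620), ("M", 1480),
           ("F", 0), ("M", 1530), ("F", 1590)]
ranks = rank_within_group(entries)
print(ranks)
# [2, 2, 1, 0, 3, 1]
```

Adding the club condition from the second formula would just mean comparing a second field in the same generator expression.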

Match one sheet figures with another on excel

Okay, as you can see from the screenshots below, I have got 3 sheets in Excel: 2014, 2013 and comparison. Basically, in the comparison tab I intend to compare figures from 2014 with 2013 for several of my business's sites. So if you check the "comparison" image you will see the initials of where my business sites are located across the UK, e.g. ABZ = Aberdeen etc., and I am analysing the revenue, cons and weight per week for every site, so as you can imagine that is quite a lot of information. I have condensed it for this example.
But is there a formula that will let me fill in the "comparison" sheet with all the information from the "2014" and "2013" sheets without having to manually sit there and add the formula ='2013'!c5, which gives me the value for the 1st week for revenue? If I want it for week 2 I have to change the formula manually to ='2013'!c13. Check the images for clarity.

Extracting data from a non-formatted string

I want to extract certain parts and be able to put them into a nice spreadsheet format. The important parts are the address, ward number, square feet, and price. I was going to try something really complicated in PHP (I'm a novice), but thought there might be an easier way.
The data looks like this:
243-467
1402 E. Mt. Pleasant Ave. 50th Ward approximately 1,416 sq. ft. more or less BRT# 502440300 Improvements: Residential Dwelling
JANET DENNIS C.P. October Term, 2007 No. 01082 $105,641.01 Morton R. Branzburg, Esq.
244-712A
5407 Chestnut St. - Premise A 60th Ward Apt 2-4 Unts 2 sty Masonry; Improvement Area 4,610 sq. ft. BRT# 603011200 Improvements: Residential Dwelling
ALEXANDER TALMADGE, JR. (WHO HAS 1/3 INTEREST), BERNADINE ABAD AND BERNARD BLAIR TALMADGE $32,153.00 Drew Salaman, Esq.
Where does the data come from? Can you modify its output? If so, try outputting CSV text (http://en.wikipedia.org/wiki/Comma-separated_values). Excel will import CSV files.
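If the source output cannot be changed, a regular expression can pull the four fields out of each record and write them as CSV. The sketch below uses Python rather than PHP and is based only on the two sample records above, so the pattern is an assumption about the layout: the address is everything before "NNth Ward", the area is the number before "sq. ft.", and the price is the first dollar amount after that. The names RECORD and parse_record are made up for this example.

```python
import csv
import io
import re

RECORD = re.compile(
    r"^(?P<address>.+?)\s+(?P<ward>\d+)(?:st|nd|rd|th) Ward\b"
    r".*?(?P<sqft>[\d,]+) sq\. ft\."
    r".*?\$(?P<price>[\d,]+\.\d{2})",
    re.DOTALL,
)


def parse_record(description, owner_line):
    """Return a dict of fields, or None if the record does not fit the pattern."""
    match = RECORD.search(description + "\n" + owner_line)
    if match is None:
        return None
    return {
        "address": match["address"],
        "ward": int(match["ward"]),
        "sqft": int(match["sqft"].replace(",", "")),
        # Kept as text so the amount is written exactly as it appears.
        "price": match["price"].replace(",", ""),
    }


records = [
    ("1402 E. Mt. Pleasant Ave. 50th Ward approximately 1,416 sq. ft. more or less "
     "BRT# 502440300 Improvements: Residential Dwelling",
     "JANET DENNIS C.P. October Term, 2007 No. 01082 $105,641.01 Morton R. Branzburg, Esq."),
    ("5407 Chestnut St. - Premise A 60th Ward Apt 2-4 Unts 2 sty Masonry; "
     "Improvement Area 4,610 sq. ft. BRT# 603011200 Improvements: Residential Dwelling",
     "ALEXANDER TALMADGE, JR. (WHO HAS 1/3 INTEREST), BERNADINE ABAD AND "
     "BERNARD BLAIR TALMADGE $32,153.00 Drew Salaman, Esq."),
]
rows = [row for row in (parse_record(d, o) for d, o in records) if row]

# Written to an in-memory buffer here; open("out.csv", "w", newline="") works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["address", "ward", "sqft", "price"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```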
