linking info of pairs of respondents (couples) in SPSS - statistics

I am preparing to analyse the determinants of partner choice in SPSS, but I can't get off the ground because I don't know how to create new variables based on information about each respondent's spouse (e.g. education, wages, social background, ethnicity).
Each respondent is currently identified by an ID# and appears in two places in the data matrix: as a unit/respondent (a row) and as a spouse (either wife or husband), recorded in a separate column. What I need is to take the variables in each individual's own row and use them to create new variables in the row of that person's spouse.
If it helps, I also have a separate file with all couples linked row-wise, as variables of the same unit, with the same ID# as in my "variables file" (yesterday, however, I merged these files, hopefully correctly...).

Assume you have variables called age, weight, id and spouse_id in a file called main.sav.
Make a copy of main.sav and call it spouse.sav.
Rename all the content variables in spouse.sav: i.e., make age spouse_age and weight spouse_weight.
In spouse.sav, remove the variable spouse_id and then rename id to spouse_id, so that each row is keyed by the person it describes.
Make sure both files are sorted by spouse_id, then open main.sav and merge in spouse.sav on the common key spouse_id, i.e., Data > Merge Files > Add Variables.
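For readers who want to see the logic of this merge outside SPSS, here is a minimal Python sketch. It is not SPSS syntax; it assumes the same variable names (id, spouse_id, age, weight) and uses invented sample rows. The key point it illustrates is that the spouse's values are looked up by the spouse's own id.

```python
# Hypothetical sample data: each person appears once, with a pointer to the spouse.
main = [
    {"id": 1, "spouse_id": 2, "age": 34, "weight": 70},
    {"id": 2, "spouse_id": 1, "age": 36, "weight": 82},
    {"id": 3, "spouse_id": 4, "age": 51, "weight": 64},
    {"id": 4, "spouse_id": 3, "age": 49, "weight": 90},
]

CONTENT_VARS = ["age", "weight"]  # variables to copy over from the spouse


def add_spouse_vars(rows, content_vars):
    """Return copies of rows with spouse_<var> columns looked up via spouse_id."""
    by_id = {row["id"]: row for row in rows}  # plays the role of spouse.sav
    merged = []
    for row in rows:
        spouse = by_id.get(row["spouse_id"])  # None if the spouse is not in the data
        new_row = dict(row)
        for var in content_vars:
            new_row["spouse_" + var] = spouse[var] if spouse else None
        merged.append(new_row)
    return merged


linked = add_spouse_vars(main, CONTENT_VARS)
for row in linked:
    print(row)
```

Respondents whose spouse is missing from the file get None for the spouse variables rather than being dropped.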

Related

In Excel, how do I count the number of times a pair of phrases in Column B are associated with a unique ID in Column A?

Hoping this is something Excel can do. I work in a pharma lab, and have been assigned a task of totaling up the number of Freeze/Thaw cycles each sample/vial has gone through in our labs for a particular study. To do this, I exported a storage report from our database and pasted the pertinent info in a new worksheet, see below. Column A is a list of the barcoded labels adhered to each specimen. Column B is a list of the transfers each sample has gone through during its life cycle here.
What I need: the number of "Move to Thaw" / "Move to Freeze" pairs (F/T cycles; one Move to Thaw plus one Move to Freeze = 1 F/T cycle) associated with each Custom Id. Not all Custom Ids will have an F/T cycle, and some may have up to 7. All other transfer types are not important.
So, is this something Excel can do for me? There are 2000 lines in this workbook that I'd rather not go through line by line. So far, I have tried filtering out all the other transfer types, but then the IDs with no F/T cycle are excluded.
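The counting rule described above can be sketched in a few lines of Python, as an illustration of the logic rather than an Excel formula. The sample rows and IDs are invented, and the sketch assumes that one cycle is simply one "Move to Thaw" matched with one "Move to Freeze" for the same ID, regardless of order.

```python
from collections import Counter

# Hypothetical rows as exported: (Custom Id, transfer type).
transfers = [
    ("S001", "Move to Thaw"),
    ("S001", "Move to Freeze"),
    ("S001", "Move to Shelf"),
    ("S001", "Move to Thaw"),
    ("S001", "Move to Freeze"),
    ("S002", "Move to Shelf"),
    ("S003", "Move to Thaw"),
]


def count_ft_cycles(rows):
    """Count freeze/thaw cycles per ID; IDs with no cycles are kept with 0."""
    thaws, freezes = Counter(), Counter()
    ids = dict.fromkeys(custom_id for custom_id, _ in rows)  # every ID, in order
    for custom_id, transfer in rows:
        if transfer == "Move to Thaw":
            thaws[custom_id] += 1
        elif transfer == "Move to Freeze":
            freezes[custom_id] += 1
    # Assumption: one cycle = one thaw matched with one freeze.
    return {custom_id: min(thaws[custom_id], freezes[custom_id]) for custom_id in ids}


cycles = count_ft_cycles(transfers)
print(cycles)
```

Because the list of IDs is built from all rows before counting, samples that never went through a freeze/thaw cycle still appear in the result with a count of zero.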

Creating a process to filter Specific Data via Rows and Columns in Excel

I'm trying to help a colleague with some work in Excel. He has a data set of 40 organisations, each of which has multiple Key Personnel (KP). Each KP has been assessed (given a Y or N) against three key areas of criteria:
Geographic Area (Broken down into 26 Geographic Areas)
Industry Experience (Broken down into 18 Industries)
Areas of Expertise (Broken down into 18 Areas)
An example of the data is shown in the linked screenshot.
What I am trying to achieve is to set up a 'filter form' that lets an individual enter their requirements (e.g. Aged Care experience, in all of the West Region) and returns the organisations that fit these criteria.
I have attempted this with a Pivot Table, but have had no luck because of the different criteria and the fact that each organisation has multiple KP.
Any assistance would be much appreciated, both on whether this can actually be achieved in Excel and on how it could be done. If it can't, I was wondering whether an Access database could be used.
Update:
Please see attached the example data extract as requested by donPablo
Data Extract
From discussions with my colleague, the best outcome for him would be to get the Supplier, the KP and the other criteria (think of it as filtering to hide all the organisations and KPs except the ones that meet the criteria).
If this is not achievable, having just the names of the organisations and KPs that meet the criteria as the output would suffice.
Think about maintenance of the ExampleData...
Adding a new Industry. Adding a new Expertise.
Splitting Industry into 3 Industry-s
Adding new Org with 2 KP
Deleting old KP3 from an org
For now with the initial concept, changes are small.
But soon in growth period there will be many changes.
How do you distribute these changes to all the users?
Thus, some sort of Split solution is needed.
A back-end DB (XLS or MSACCESS or SQLSERVER) ,
and a front-end form for--
Selection(s)
Results
Back-end as XLS could still be as ExampleData...
To be kept in central office.
And a front-end that links or references that db
but does not contain all the detail rows.
I think that the main matrix needs another column
called AreaType, value G or I or E
and that the area heading row needs to say
"ANY Geo" and have all "Y"-s in each column, and likewise for I and E.
In searching the matrix for Aged Care we should only look at Industry.
The ANY row would be chosen when the user does not choose an area.
I think that "Org" is a separate table
And that "KP" is another separate table.
This allows full details to be stored elsewhere
than the main matrix of areas.
Column heading of matrix would be "Org#~KP#", which would be
parsed on the tilde and separately looked up.
(it is improbable that any org or kp will have a tilde).
Yes, it is possible to search the matrix and retrieve qualified rows.
For nCol = minCol To maxCol
    CountYInG = 0: CountYInI = 0: CountYInE = 0
    For Each AreaType In (G, I, E)
        ' look at what the user selected for this type (gggg / iiii / eeee)
        For Each AreaName In the selection for this AreaType
            If matrix(AreaName row, nCol) = "Y" Then add 1 to the count for this AreaType
        Next
    Next
    If CountYInG > 0 And CountYInI > 0 And CountYInE > 0 Then
        ' this Org/KP qualifies
    End If
Next
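As a rough illustration, the search loop above could look like this in Python. The matrix contents, area names and the "Org#~KP#" column keys are invented for the example, and an empty selection for an area type is treated as "any", which stands in for the ANY row described earlier.

```python
# Hypothetical matrix: (area_type, area_name) -> {"Org#~KP#": "Y" or ""}.
matrix = {
    ("G", "West Region"): {"1~1": "Y", "1~2": "", "2~1": "Y"},
    ("G", "East Region"): {"1~1": "", "1~2": "Y", "2~1": ""},
    ("I", "Aged Care"): {"1~1": "Y", "1~2": "Y", "2~1": ""},
    ("E", "Auditing"): {"1~1": "Y", "1~2": "", "2~1": "Y"},
}


def qualifying_columns(matrix, selection):
    """Return the Org#~KP# columns with at least one "Y" in every selected area type.

    selection maps an area type ("G", "I", "E") to the chosen area names.
    A type with no chosen names places no restriction on the result.
    """
    columns = sorted({col for row in matrix.values() for col in row})
    result = []
    for col in columns:
        qualifies = True
        for area_type in ("G", "I", "E"):
            names = selection.get(area_type) or []
            if not names:
                continue  # nothing chosen for this type: treat as ANY
            hits = sum(
                1 for name in names
                if matrix.get((area_type, name), {}).get(col) == "Y"
            )
            if hits == 0:
                qualifies = False
                break
        if qualifies:
            result.append(col)
    return result


chosen = {"G": ["West Region"], "I": ["Aged Care"]}
matches = qualifying_columns(matrix, chosen)
# Split each heading on the tilde to get the Org# and KP# for separate lookup.
print([m.split("~") for m in matches])
```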
added Pi Day, 20:00
First inclination is NOT to have 3 criteria tables (G/I/E), but rather ONE table.
Let's make several alternative DB designs. Then look at usage, and rank them.
Finally choose one and do it. Good luck, and Bye.
Matrix alternative
MatrixTable--AreaType & AreaName (PK), and one attribute Column for each Org/KP with value 'Y' or blank.
1st row has PK=C-ColHeadings, and each Column has Org#/KP# for that column.
OrgTable--Org# (PK), and OrgName, OrgStreet1, OrgStreet2, OrgCity/State/Zip, OrgPhone, ...
KPTable --KP# (PK), and KPName, KPOrg#, KPPhone
Normalized alternative (Admin would need to do pivot to see matrix view)
DetailTable--Org#(FK)-KP#(FK)-AreaType-AreaName(FK) DetailValue = 'Y' or ('Y' by implication of row existence)
OrgTable--Org# (PK), and OrgName, OrgStreet1, OrgStreet2, OrgCity/State/Zip, OrgPhone, ...
KPTable --KP# (PK), and KPName, KPOrg#, KPPhone
AreaTable--AreaType-AreaName(PK) (so that everyone spells it the same)
Your favorite design... list the tables, and their fields
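To make the normalized alternative more concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table and column names are adapted from the lists above (OrgNo and KPNo stand in for Org# and KP#), the sample rows are invented, and a row in Detail means 'Y' by implication of its existence. It is one possible layout, not a recommendation over the matrix alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Org  (OrgNo INTEGER PRIMARY KEY, OrgName TEXT);
CREATE TABLE KP   (KPNo INTEGER PRIMARY KEY, KPName TEXT,
                   KPOrgNo INTEGER REFERENCES Org(OrgNo));
CREATE TABLE Area (AreaType TEXT, AreaName TEXT,
                   PRIMARY KEY (AreaType, AreaName));
-- A row in Detail means 'Y'; absence of a row means blank.
CREATE TABLE Detail (OrgNo INTEGER, KPNo INTEGER, AreaType TEXT, AreaName TEXT,
                     PRIMARY KEY (OrgNo, KPNo, AreaType, AreaName));
""")
conn.executemany("INSERT INTO Org VALUES (?, ?)", [(1, "Org A"), (2, "Org B")])
conn.executemany("INSERT INTO KP VALUES (?, ?, ?)",
                 [(1, "KP One", 1), (2, "KP Two", 1), (3, "KP Three", 2)])
conn.executemany("INSERT INTO Area VALUES (?, ?)",
                 [("G", "West Region"), ("I", "Aged Care")])
conn.executemany("INSERT INTO Detail VALUES (?, ?, ?, ?)", [
    (1, 1, "G", "West Region"), (1, 1, "I", "Aged Care"),
    (1, 2, "I", "Aged Care"),
    (2, 3, "G", "West Region"),
])

# KPs that have BOTH Aged Care (industry) and West Region (geography).
query = """
SELECT o.OrgName, k.KPName
FROM KP k JOIN Org o ON o.OrgNo = k.KPOrgNo
WHERE EXISTS (SELECT 1 FROM Detail d WHERE d.KPNo = k.KPNo
              AND d.AreaType = 'I' AND d.AreaName = 'Aged Care')
  AND EXISTS (SELECT 1 FROM Detail d WHERE d.KPNo = k.KPNo
              AND d.AreaType = 'G' AND d.AreaName = 'West Region')
ORDER BY o.OrgName, k.KPName
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Adding a new industry or expertise in this layout is a new row in Area rather than a new column, which is the maintenance concern raised above.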

Partial Index Match One Long Cell Value to Values from 3 Different Columns

Hi guys, I have no idea how to do this, probably because I don't get the logic behind building a formula for it, so I'm asking here instead.
I'm administering students' photos and have to collect them and rename them to a consistent format. Basically, I have to do the following:
1. List the photo names by writing a .bat file using a simple command like dir /b *.jpg *.jpeg *.png *.tiff >ClientList.txt
2. Create an Excel sheet that looks up the students' names and student IDs in a main database using INDEX/MATCH
3. Concatenate the results into a batch of commands to run at the Command Prompt, renaming every photo at once
As of now, I am stuck at Step 2 and doing it manually, because many students did not follow the guidelines: they sent in photos named sometimes with only the student ID, sometimes "First Name, Last Name", sometimes "Last Name, First Name", and sometimes a mix of name and student ID. Also, the students' names come in different languages (Chinese and English).
As such, I want a way for Excel to search the columns that contain English names, Chinese names and student IDs and return the closest match, so that I don't have to identify them one by one.
I hope my question can be understood as I am not too sure how to explain it thoroughly.
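One way to think about the "closest match" in Step 2 is sketched below in Python, using the standard difflib module. This is not an Excel formula; the student records and file names are invented, and the scoring rule (an exact substring counts as a perfect hit, otherwise a similarity ratio is used) is just one plausible choice.

```python
import difflib
import os

# Hypothetical master list: (student ID, English name, Chinese name).
students = [
    ("S1001", "Alice Wong", "黃愛麗"),
    ("S1002", "Bob Chan", "陳博"),
    ("S1003", "Carol Lee", "李嘉露"),
]


def normalise(text):
    """Lower-case and keep only letters and digits so spellings compare cleanly."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def best_match(filename, students):
    """Return (score, student) for the closest student by ID or either name."""
    stem = normalise(os.path.splitext(filename)[0])
    best = (0.0, None)
    for student in students:
        sid, english, chinese = student
        first, _, last = english.partition(" ")
        # Candidate spellings: ID, both name orders, Chinese name.
        for cand in (sid, first + last, last + first, chinese):
            cand = normalise(cand)
            if cand and cand in stem:
                score = 1.0  # exact substring hit, e.g. an ID inside the file name
            else:
                score = difflib.SequenceMatcher(None, stem, cand).ratio()
            if score > best[0]:
                best = (score, student)
    return best


for name in ["wong_alice.jpg", "S1002 photo.png", "李嘉露.jpeg"]:
    score, student = best_match(name, students)
    print(name, "->", student[0], round(score, 2))
```

Low scores are worth reviewing by hand, since a fuzzy match can pick the wrong student when two names are similar.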

Generating a unique identifier list with multiple tables & criteria

I'm not a coder, just someone who uses excel for basic estimating functions at work. However I've found myself in need of a complex list or index system.
Background/Intent: (Skip below if doesn't matter.) In an apartment building construction they build buildings like opened books - mirror images of 2 bed 2 bath apartments, for example. There is a standard "typical" unit and then the mirror image across the hall, the "reverse" unit. The door swings are all opposite from one to the other. My job is to figure out how to give each door a unique identifier code based on: Bldg No., Unit Type, Door No., Door Swing (left or right.) The raw data tables are provided below.
I've attempted to clean this up as much as possible, but there are two steps (I think) to this process.
Step 1:
The raw data table is on the left. My output field is on the right. I want to be able to pick the building from a drop-down box, like a data validation list. Then a formula (which one?) spits out a list of every unit type in that building. For example, Bldg 5 has 2 of "A1 Typ." How do I get the formula to recognize that if there are 2 of them, it should produce 2 separate lines for "A1 Typ.", and so on until all 41 occurrences/units have been accounted for and labeled appropriately? Some occur once, some multiple times, and some zero times.
Step 1
From there, Step 2.
I want to use this output field again to automate another sequence, this time pulling from a different table (see picture). Now, depending on the unit type under the "type" column, I want it to expand each unit type to show each individual door number (1, 2, 3 etc. through 12) and whether it's an L (LH) or R (RH) and, if there is more than one, to list out each occurrence. (What formula?)
Then the descriptor text that appears under the "DOOR LABEL" column would just be a join of several fields to give a unique identifier. (Suggestions?)
Step 2.
Easy right? Is this too much for excel, or can this be done?
Thanks so much for considering helping me out!
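The two expansion steps described above can be sketched in Python to show the intended logic. All data here is invented (the real tables were only available as pictures), and the label format is a hypothetical join of building, unit type, unit occurrence, door number and swing.

```python
# Hypothetical data: how many of each unit type a building has,
# and the doors (number, swing) in each unit type.
units_per_building = {5: {"A1 Typ": 2, "A1 Rev": 1, "B2 Typ": 0}}
doors_per_unit_type = {
    "A1 Typ": [(1, "L"), (2, "R")],
    "A1 Rev": [(1, "R"), (2, "L")],
    "B2 Typ": [(1, "L")],
}


def door_labels(building, units, doors):
    """Expand unit counts into one entry per unit, then one label per door."""
    labels = []
    for unit_type, count in units[building].items():
        for occurrence in range(1, count + 1):       # Step 1: repeat per unit
            for door_no, swing in doors[unit_type]:  # Step 2: expand the doors
                labels.append(
                    f"B{building}-{unit_type.replace(' ', '')}"
                    f"-U{occurrence}-D{door_no:02d}{swing}"
                )
    return labels


labels = door_labels(5, units_per_building, doors_per_unit_type)
print("\n".join(labels))
```

Unit types with a count of zero produce no lines, and including the occurrence number keeps labels unique when a unit type appears more than once in a building.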

Book ordering comparison between spreadsheets for existing catalogue of a Library

I have recently asked this question on Google's spreadsheet page.
I have a significant data comparison problem I would like to solve. It relates to purchasing books for a library. We have a catalogue of over 11,000 books. When we order new books we need to compare our proposed purchases to the current stock. Currently we can only compare them to our catalogue manually, very laboriously, book by book.
We need to do 3 things to make our life easier:
1 Easily clean out bad data/characters in the ISBNs: these are either spaces, - (hyphens) or . (periods or full stops). A simple formula to run over all ISBN fields would be great.
2 Compare data between one spreadsheet with 11,000 books in it (current library stock), a second with up to 1000 books in it (currently on order) and finally the third, currently active one (about to be ordered) with 50 to 200 books listed in it.
All spreadsheets use the same column configuration as below
Library orders
Title | Author | Publisher | ISBN (long version) | US$ | UKgpd | HK$ | Other$ | P/O no. | Date ordered
UNNATURAL SELECTION | MARA HVISTENDAHL | Public Affairs Publishing; Reprint edition (May 1, 2012) | 978610391511 | | | | | |
Finally, the output of these comparisons should quickly and easily identify on which lines we have matches, and what type of match it is: Author only, Author and Title, or Author, Title and ISBN, etc., for all the possible combinations. To make this easier, assume spreadsheet 1 is an unalterable master table, with spreadsheet 2 similar. It is really only on spreadsheet 3 that we need to be clear whether we are starting to reorder materials.
If it is possible to have these as different sheets in a workbook it would be ideal. The only additional feature is that any scripts that run need to be able to cope with spreadsheet 1 increasing in size as new acquisitions arrive and are included. Both spreadsheets 2 and 3 will vary (increase and decrease) as the ordering process proceeds.
Finally the absolute ideal would be for this comparison process to be instant (live) and ongoing as data is included.
If anyone would like to take this on 3 Library staff will be eternally grateful.
regards
Nick
This would be very much easier had you one sheet rather than three (simply add a column to each existing sheet to show whether in stock, on order or to be ordered – three individual letters would be sufficient, then append each of the smaller two files to the largest). Then for example you could apply Conditional Formatting to highlight duplicates one column at a time (Author, Title etc). Apart from the initial data cleansing it would mean in the future switching ‘between sheets’ would merely involve changing a one-letter flag. Filtering would allow you and your colleagues to appear to have three separate sheets and if anyone asks for a particular Title the search would be one-time, not in triplicate.
Also, http://www.microsoft.com/en-gb/download/details.aspx?id=15011 may be of interest, as may =SUBSTITUTE. And with data validation you could prevent entry of a new ISBN that is already in your list.
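The ISBN clean-up and the match classification asked for in the question can be sketched in Python as follows. This only illustrates the logic, not a spreadsheet formula. The stock row reuses the example from the question; the order rows are invented, and matching on exact text after lower-casing and trimming is an assumption.

```python
def clean_isbn(raw):
    """Strip spaces, hyphens and full stops from an ISBN string."""
    return "".join(ch for ch in str(raw) if ch not in " -.")


def norm(text):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(str(text).lower().split())


def classify(order_row, stock_rows):
    """Return the matched field names for the stock row sharing the most fields."""
    best = ()
    for stock in stock_rows:
        matched = []
        for field in ("author", "title", "isbn"):
            if not order_row[field]:
                continue  # an empty field never counts as a match
            if field == "isbn":
                same = clean_isbn(order_row[field]) == clean_isbn(stock[field])
            else:
                same = norm(order_row[field]) == norm(stock[field])
            if same:
                matched.append(field)
        if len(matched) > len(best):
            best = tuple(matched)
    return best


stock = [{"title": "UNNATURAL SELECTION", "author": "MARA HVISTENDAHL",
          "isbn": "978610391511"}]
to_order = [
    {"title": "Unnatural Selection", "author": "Mara Hvistendahl",
     "isbn": "978-610-391.511"},
    {"title": "Another Title", "author": "Mara Hvistendahl", "isbn": ""},
    {"title": "Example Book", "author": "A. N. Other", "isbn": "000"},
]

results = [classify(row, stock) for row in to_order]
for row, matched in zip(to_order, results):
    print(row["title"], "->", " + ".join(matched) or "no match")
```

The same classification could be run against both the stock list and the on-order list by passing their rows together as stock_rows.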

Resources