i have a query
select count(distinct imsi) from SS7 where sccpcalleddigits like '86133%' .
what does this sccpcalleddigits mean? I am completely new to this domain.
It is simply the called party digits - i.e. the number that is being called.
I am guessing this is clear but just for completeness - if Ann calls Bob, Ann's number is the calling number and Bob's is the called number.
More detailed:
SCCP includes a 'Called party address' and a 'Calling party address'.
These both have similar formats and can contain different information in different circumstances - from the ITU spec (Q.731 - you can usually find a PDF copy with a web search). The high level structure, from the ITU spec is:
The address indicator part is defined:
The "address indicator" indicates the type of address information contained in the address field (see Figure 4). The address consists of one or any combination of the following elements:
– signalling point code;
– global title (for instance, dialled digits);
– subsystem number.
So, assuming you mean by 'called digits' what the spec refers to as 'dialled digits' these are the number being called as above.
An SCCP called/calling address support multiple formats so if this is not what you are referring to then it might be worth looking at the 'Called Party Address' section of the spec for more background.
SCCP is one of layers in telco protocols stack used between telco exchanges (more precisely in this case - as we have IMSI - it must been a mobile MSC or HLR). So SCCP Address (in this case - Called Digits) is more an address of the target MSC rather than a number of Ann (but IMSI number could be hers).
Related
Situation:
I have been tasked with geocoding and plotting addresses to a map of a city for a friend of the family.
I have a list of over 400 addresses. Included in those addresses are PO Boxes and addresses with Street Number, Direction, Street Name, Street Suffix (some do not have this), City, and Zip Code.
I tried to geocode all of the addresses with Geopy and Nominatim.
Issue:
I noticed that those addresses without street suffixes and PO Boxes could not be geocoded.
What I have done:
I have read most posts dealing with addresses, read the Geopy notes and google searched until the cows came home.
I ended up stumbling across a geocoding website that PO boxes could not be mapped and that street suffix is required for mapping.
http://www.gis.harvard.edu/services/blog/geocoding-best-practices
Question:
Is there a way to search for the street suffix of each street that is missing a street suffix?
Is there another free service or library that can be utilized other than Nominatim and Geopy that can utilize the information I have and not require me to look up each individual street suffix in google maps?
Please advise!
I found out that using Geopy with Google's API can find the correct addresses that services like Nominatim, OpenCage and OpenMapquest will not fine.
There is one downside, the autocomplete can make it hard to determine if the address is the correct address.
First, speaking to the need to find an address that is missing a street suffix, you need to use address completion from an address validation service. Services that do address validation/verification use postal service data (and other data) and match address search inputs to real addresses. If the search input is not sufficiently specific, address validation services may return a handful of potential matches. Here is an example of a non-specific address (missing the State, zip code, and the street suffix) that returns two real addresses that match the search input. SmartyStreets can normally fill in the missing street suffix.
Second, speaking to the PO Box problem: some address services can give you geocode information, as well as other information that you may believe isn't available. For instance, this search shows the SmartyStreets service matching a PO Box number (that I just made up) to the local post office. The latitude and longitude in the response JSON corresponds to the post office when I search it on Google Maps.
Third, speaking to the problem of having a list of addresses: there are various address services that allow batch processes. For instance, it's a fairly common feature to allow a user to upload a spreadsheet of addresses. Here is the information page for SmartyStreets' tool.
There are multiple address services that can help you do all or some of these things. Depending on the service, they will provide some free functionality or have free tiers if you don't do very many searches. I am not aware of a service that does everything you need for free. You could probably use a few services together, like the Google Maps API to Geopy, etc, but it would take effort to code up a script to put them all together.
Full disclosure: I worked for SmartyStreets.
Actually arrival is pretty simple, tag gets into a range of receivers antenna, but the departure is what is causing the problems.
First some information about the setup we have.
Tags:
They work at 433Mhz, every 1.5 seconds they transmit a "heartbeat", on movement they go into a transmission burst mode which lasts for as long as they are moving.
They transmit their ID, transmission sequence number(1 to 255, repeating over and over), for how long they have been in use, and input from motion sensor, if any. We have no control over them whatsoever. They will continue doing what they do until their battery dies. And they are sealed shut.
Receiver forwards all that data + signal strength of a tag to our software. Software can work with several receivers. Currently we are using omnidirectional antennas.
How can we be sure that the tag has departed from premises?
Problems:
Sometimes two or more tags transmit "heartbeat" at the same time and no signal is received. With number of tags increasing these collisions happen more often, this problem is solved by tags randomly changing their heartbeat rate (in several milliseconds) to avoid collisions. Problem is I can't rely on tags not "checking in" for a certain period of time as sign of departure. It could be timeout because of collisions. Because of these collisions we cannot rely that every "heartbeat" will be received.
Tag manufacturer advised that we use two receivers and set them up as a gate for tags to pass through. Based on the order of tags passing through "gates" we can tell in which direction they are going. The problem with our omnidirectional antennas is that sometimes tag signal bounces of building and then arrives to receiver. So based on signal strength it looks like its farther away then it is.
Does anybody have a solution of what we can do to have a reliable way of determining if tags are coming or leaving? Also we can setup antennas in different way as well.
I wrote the software that interprets data from receivers, so that part can be manipulated in any way. But I'm out of ideas of how to interpret information to get reliability we need.
Right now the only idea is to try out with directional antennas? But I would like to tryout all the options with the current equipment we have.
Also any literature suggestion that deals with active RFID tags is more than welcome, most of books I've found deal with passive tag solutions.
As a top level statement, if you need to track items leaving your site, your RFID technology is probably the wrong one. The technology you have is better suited to the positional tracking tags within a large area - eg a factory floor. Notwithstanding the above, here is my take:
A good approach to active RFID is to break your area down into zones that are tied to your business processes, for example:
Warehouse
Loading bay
Packing
Entry of a tag into a zone represents the start of a new process or perhaps the end of a process the tag is currently in. For example, moving from warehouse to the packing represents assembling a shipment, and movement into the loading bay initiates a shipment.
The crux of many RFID implementations is the installation and configuration of the RFID intrastructure to:
Map tag -> asset (which you have done)
Map tag read -> zone (and by inference asset -> zone)
Map movements between zones to steps in a business processes (and therefore understand when an asset leaves the site, your goal)
There are a number of considerations: the physical characteristics of 433MHz signals, position of antennae, sensitivity of antennae and some tricks that some vendors have. After an optimal site configuration, then you may need to have some processing tricks on the tag reads that will pour in.
Dirty data
Always keep in mind that tag read data is dirty - that RF interference (from unshielded motors, electric wiring, etc), weather conditions and physical manipulation of tags (eg covering with metal) happen all the time.
RSSI's are like stock tickers - there is a lot of random/microeconomic noise on top of broad macroeconomic trends. To interpret movement, compute the linear regression of groups of reads rather then rely on a specific read's RSSI.
If you do see a tag broadcasting with a high RSSI, which then falls to medium then low and then disappears, you really can interpret that as the tag is leaving the range of the receiver. Is that off-site? Well, you need to consider the site's layout (the zones) and the positioning of receivers within the zones.
TriangulationTrilateration
EDIT I had incorrectly used the term 'triangulation'. This refers to determining the position of something by known the angle it subtends from two or three known locations. In RFID, you use the distance and as such it is called 'trilateration'.
In my experience, vendors selling the tag technology you describe have server software that determines the absolute position of the tags using the received RSSI. You should be able to obtain the position of the tag within 1-10m using such software. Determining if the tag is moving off-site is then easy.
To code this yourself:
First, each tag is pinging away when moving. These pings hit the receivers at almost the same time and sent to the server. However the messages can sometimes arrive out of order or interleaved with earlier and later reads from other receivers. To help correlate pings, the ping contains a sequence number. You are looking for tag reads from the same tag, with the same sequence number, received by three (or more) receivers. If more than three, pick the three with the largest RSSI.
The distance is approximated from RSSI. This is not linear and subject to non-trivial random variation. A quick google turns up:
Given three approximate distances from three known points (the receivers' locations), you can then resolve the approximate position of the tag using Trilateration using 3 latitude and longitude points, and 3 distances.
Now you have the absolute position of the tag. You can use these positions to track the absolute movement of the tag.
To make this useful, you should position receivers so that you can reliably detect tags right up to the physical site boundaries. You should then determine a 'geofence' around your site, within receiver range. I would write a business rule that states:
If the last known position of a tag was outside the geofence, and
A tag read from the tag has not been detected in (say) 10s, then
Declare the tag has left the site.
By using the trilateration and geofence, you can focus the business logic on only those tags close to going awol. If you fail to receive your 1.5s ping only a few times from such a tag, it's highly likely that the tag has gone outside your receiver's range, and therefore off-site.
You're already aware that tag reads can sometimes come from reflections. If you have a lot of these, then your trilateration will be pretty poor. So this method works best when there are fairly large open spaces and minimal reflectors.
Some RFID vendors have all this built into their servers - processing this by writing your own code is (clearly) non-trivial.
Zone design using wide-area receivers
Logical design of zones can help the business logic layer. For example, suppose you have two zones (A and B) with two receivers (1 and 2):
A B
+----------+----------+
| | |
| 1 | 2 |
| | |
+----------+----------+
If you get tag reads from the tag at receiver 1, then one at receiver 2, how do you interpret that? Did tag T move into zone B, or just get a read at the extreme range of 2?
If you get a later read at 1, did the tag move back, or did it never move?
A better physical solution is:
A B
+----------+----------+
| | |
| 1 2 3 |
| | |
+----------+----------+
In this approach, a tag moving from A to B would get reads from the following receivers:
1 1 1 2 1 2 2 3 2 2 3 2 3 3 3 3 3
-------> time
From a programming logic point of view, a movement from A -> B has to traverse reads 1 -> 2 -> 3 (even though there is a lot of jitter). It gets even easier to interpret when you combine with RSSI.
Portal design with directional receivers
You can create quite a good portal using two directional receivers (you will need to spend some time configuring the antenna and sensitivity carefully). Mount a receiver well above the door on both sides. Below is a schematic from the side. R1 and R2 are the receivers (and the rough read field is shown), and on the left is a worker pushing an asset through the door:
----> direction of motion
-------------------+----------------
R1 | R2
/ \ | / \
o / \ / \
|-++ / \ / \
|\++ / \ / \
------------------------------------------
You should get a pattern of reads like this:
<nothing> 1 1 1 1 1 12 1 21 2 12 2 1 2 2 2 2 2 <nothing>
-------> time
This indicates a movement from receiver 1 to receiver 2.
"Signposts"
Savi implementations often use "sign posts" to assist with location. The sign post emits beam that illuminates a small area (like a doorway) in a 123KHz beam. The signpost also transmits a unique number identifying itself (left door might be 1, while the right door might be 2). When the tag passes through the beam, it wakes up and re-broadcasts the number. The reader now knows which door the tag passed through.
Watch out for any metal in the surrounding area. 123KHz travels extremely well down rebar in concrete walls, metal fences and rail tracks. We once had tags reporting themselves hundreds of meters from a signpost due to such effects.
With this approach you can implement a portal much like you would for passive.
Simulating signposts
If you don't have the ability to use signposts, then there is a dirty hack:
Stick a passive RFID tag to your active RFID tag
Install a passive RFID reader on each doorway
Passive RFID is actually very good in restricted spaces, so this implementation can work very well. This solution may be the same cost (or cheaper) than with your active RFID vendor.
If you're clever, you can use the EPC GIAI namespace for the passive tag ID and so burn it with the active tag ID. Both active and passive tags would then be identically named.
Physical considerations
433MHz tags have some interesting characteristics. Well-constructed receivers can get a read of tags within about 100m, which is a long way for RFID. In addition, 433MHz wraps itself around obstacles very well, especially metal ones. We could even read tags in the boot (trunk) of a car travelling at 50km/h - the signal propagates from the rubber seal.
When installing a reader to monitor a zone, you need to adjust its location and sensitivity very carefully to maximize the reads from tags within your zone, but also to minimize reads from outside your zone. This might be done in HW or in SW configuration (like dropping all reads below a particular RSSI).
One idea might be to move the receiver away from the area where your tags are exiting as in the layout below (R is the reader):
+-------------------------+-----------+
| Warehouse | Exit |
| . |
| .
| R . R --->
| .
| . |
| | |
+-------------------------+-----------+
It pays to do a RF site survey and spend enough time to properly understand how tags and readers work in an area. Getting the physical installation right is critical.
Other thing to do is to consider physical constrictions such as corridors and doorways and treat them as choke-points - map logical zones to them. Put a reader (with directional receiver tuned to cover the constriction) and lower sensitivity in to cover the constriction.
What no tag-reads actually means
If my experience of RFID has taught me anything, it is that you can get spurious reads at any time, and you need to treat everything with a degree of suspicion. For example, you might have a few seconds of missing reads from a given tag - this can mean anything:
A user accidentally putting a metal tin over the tag
A fork lift truck getting between tag and reader
An RF collision
A momentary network congestion
The battery dying or fading out (remember to check the low-battery flag in tag reads and ensure the business has a process to replace old tags).
Tag destroyed by a pallet being pushed into it
Stollen by someone wanting to resell it for scrap (Not a joke - this actually happened)
Oh yeah, it may be that the tag moved off-site.
If the tag has not been heard of in, say, 5 minutes, odds are that it's off site.
In most business processes that you would use this active tag technology for, a short delay before the system decides the tag is off-site is acceptable.
Conclusions
Site survey: spend time experimenting with readers in different locations. Walk around the site with a tag and see what reads you are actually getting. Use this to:
Logically segment your site into zones and locate receivers to most accurately position tags in zones
It's easier to determine movement between zones using several receivers; if possible, instrument physical constrictions such as doors and corridors as portals. As part of your RFID implementation, you might even want to install new walls or fences to create such constrictions. Consider a passive RFID for portals.
Beware of metal, especially large expanses of it.
You have dirty data. You need to compute linear regressions on the RSSIs to spot trends over short periods; you need to be able to forgive a small number of missing tag reads
Make sure that there are business processes to handle dying batteries and sudden disappearances of tags.
Above all, this problem is best solved by getting the receivers installed in the best locations and configuring them carefully, then getting the software right. Trying to solve a bad site installation with software can cause premature ageing.
Disclosure: I worked 8 years for a major active RFID vendor.
Using directional antennas sounds like it may be a more reliable option, although this obviously depends on the precise layout of your premises.
As far as using your current omnidirectional receivers, there are a couple of options I can think of:
First one, and likely easiest, would be to collect some data on the average 'check-in' times you are seeing for on-site tags, possibly as a function of the number of on-site tags (if the number is likely to change dramatically - as your collision frequency will be related to the number of tags present). You can then analyse this data to see if you can choose a suitable cut-off time, after which you declare that a tag is no longer present.. Obviously exactly what cut-off you choose will depend on the data you see and your willingness to accept false positives - it could also be that any acceptable cut-off time lies outside your 3 minute window (although I suspect that if that is the case then your 3 minute window may not be viable).
Another, more difficult, option (or group of options more like), would be to utilise more historical information about each tag - for instance, look for tags whose signal strength gradually decreases and then disappears, or tags whose check-in time changes drastically, or perhaps utilise multiple receivers and look for patterns between receivers - such as tags which are only seen by one receiver and then disappear, or distinctive patterns of signal strength (indicating bearing) between receivers as tags go off-site.
Obviously the second option is really about looking for patterns, both over time and between receivers, and is likely to be much more labour (and analysis) intensive to implement. If you are able to capture enough good quality data you might be able to utilise machine-learning algorithms to identify relevant patterns.
We do this every day.
First question is: "How many tags do you have at a reader at any given time?". Collisions are more rare than you might think, but they do happen and tag over-population can be easily determined.
Our Software was written and might be using the same readers and tags that you are using. We set reader timeouts to determine when a tag is "away" or "offsite"; usually 30 seconds without the tag being read. Arrival of course is instantaneous when a tag is detected at the reader, then the tag is flagged "onsite".
We also have the option to use multiple readers; one at a gate and another on the parking lot or in the building for example. The gate reader has a short timeout. If a tag passes the gate reader, it is red and then times out very quickly to flag the tag as "offsite". If a tag is then read by any other reader, the tag is then considered "onsite".
I can post links if you think it would be helpful, else you can search for RFID Track. It's iOS App and there are settings posted for a demo server.
Peter
I have a column which is made up of addresses as show below.
Address
1 Reid Street, Manchester, M1 2DF
12 Borough Road, London, E12,2FH
15 Jones Street, Newcastle, Tyne & Wear, NE1 3DN
etc .. etc....
I am wanting to split this into different columns to import into my SQL database. I have been trying to use Findstring to seperate by the comma but am having trouble when some addresses have more "sections" than others. ANy ideas whats the best way to go about this?
Many THanks
This is a requirements specification problem, not an implementation problem. The more you can afford to assume about the format of the addresses, the more detailed parsing you will be able to do; the other side of the same coin is that the less you will assume about the structure of the address, the fewer incorrect parses you will be blamed for.
It is crucial to determine whether you will only need to process UK postal emails, or whether worldwide addresses may occur.
Based on your examples, certain parts of the address seem to be always present, but please check this resource to determine whether they are really required in all UK email addresses.
If you find a match between the depth of parsing that you need, and the assumptions that you can safely make, you should be able to keep parsing by comma indexes (FINDSTRING); determine some components starting from the left, and some starting from the right of the string; and keep all that remains as an unparsed body.
It may also well happen that you will find that your current task is a mission impossible, especially in connection with international postal addresses. This is why most websites and other data collectors require the entry of postal address in an already parsed form by the user.
Excellent points raised by Hanika. Some of your parsing will depend on what your target destination looks like. As an ignorant yank, based on Hanika's link, I'd think your output would look something like
Addressee
Organisation
BuildingName
BuildingAddress
Locality
PostTown
Postcode
BasicsMet (boolean indicating whether minimum criteria for a good address has been met.)
In the US, just because an address could not be properly CASSed doesn't mean it couldn't be delivered - cip, my grandparent-in-laws live in enough small town that specifying their name and city is sufficient for delivery as local postal officials know who they are. For bulk mailings though, their address would not qualify for the bulk mailing rate and would default to first class mailing. I assume a similar scenario exists for UK mail
The general idea is for each row flowing through, you'll want to do your best to parse the data out into those buckets. The optimal solution for getting it "right" is to change the data entry method to validate and capture data into those discrete buckets. Since optimal never happens, it becomes your task to sort through the dross to find your gold.
Whilst you can write some fantastic expressions with FINDSTRING, I'd advise against it in this case as maintenance alone will drive you mad. Instead, add a Script Transformation and build your parsing logic in .NET (vb or c#). There will then be a cycle of running data through your transformation and having someone eyeball the results. If you find a new scenario, you go back and adjust your business rules. It's ugly, it's iterative and it's prone to producing results that a human wouldn't have.
Alternatives to rolling your address standardisation logic
buy it. Eventually your business needs outpace your ability to cope with constantly changing business rules. There are plenty of vendors out there but I'm only familiar with US based ones
upgrade to SQL Server 2012 to use DQS (Data Quality Services). You'll probably still need to buy a product to build out your knowledge base but you could offload the business rule making task to a domain expert ("Hey you, you make peanuts an hour. Make sure all the addresses coming out of this look like addresses" was how they covered this in the beginning of one of my jobs).
I'm writing a Regex to validate email. The only one thing confuse me is:
Is it possible to have single character for top level domain name? (e.g.: lockevn.c)
Background: I knew top level domain name can be from 2 characters to anything (.uk, .us to .canon, .museum). I read some documents but I can't figure out does it allow 1 character or not.
It is technically possible, however, there are no single character tlds that have been accepted into the root (as of the moment) so the answer is:
Yes, it is possible to have single character for top level domain name, however, there are currently no single character TLDs in the root.
You can see the list of TLDs that are currently in the root at this URL:
http://data.iana.org/TLD/tlds-alpha-by-domain.txt
RFC-952 shows what a "name" is, this includes what is valid as a top level domain:
A "name" (Net, Host, Gateway, or Domain name) is a text string up
to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus
sign (-), and period (.).
Additionally, the grammar from RFC-952 shows:
<name> ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]
RFC-1123 section 2.1 specifically allowed single letter domains & subdomains, changing the initial grammar of RFC-952 from starting with just a letter to being more relaxed, so now you are allowed to have single letter top level domains that are a number:
2.1 Host Names and Numbers
The syntax of a legal Internet host name was specified in RFC-952.
One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a
letter or a digit. Host software MUST support this more liberal
syntax.
EDIT: As per #mr.spuratic's comment, RFC-3696 section 2 tightened the rules for top level domains, stating:
There is an additional rule that essentially requires
that top-level domain names not be all-numeric.
This means that:
a. is a valid top level domain
1. is not a valid top level domain
A very unscientific test of this shows that if I add "a" into my hosts file pointing to my local machine, going to http://a in my address bar does show my Apache welcome page.
I'm not sure about the internet standard, but in practice, no.
See,
http://www.norid.no/domenenavnbaser/domreg.html
and,
http://sqa.fyicenter.com/Online_Test_Tools/Domain_Name_Format_Validator.php
You should DEFINITELY allow 1-character domains since some registries allow them not by accident (and I speak of quite big registries like UK, Germany, Poland, Ireland too - so important contributors to the Internet community, not oney exotic small exceptions). Since I also plan using such domains, that definitely work also with all e-mail services I used, letters AND numbers, I really would give the hint to allow this, else your script might need later correction.
Also some of the biggest internet companies use such domains - one of the most famous examples is Twitters t.co for shortening. Other companies I know of who have such domains are Facebook, Google, PayPal, Deutsche Telekom. But the list is longer and also some bigger investors hold them as assets.
By the way as proof there is a website trading this kind of domains online if You search for "1 letter domain names" :)
I am trying to write a company's details parser that can split text like the following into it's constituent parts:
THALES LAND AND JOINT SYSTEMS
Total Signature Management
Wookey Hole Road
Wells
Somerset
BA5 1AA
Tel: +44(0)1749 682384
Fax: +44 (0)1749 682235
The problem I am having is, how I can tell that "Total Signature Management" is not actually part of the address? Normally, a company will display its name "THALES LAND AND JOINT SYSTEM" and line 2 would normally be the first part of the address.
In the case above, the company name is followed by a non address part, is there anyway to tell the difference?
Thanks
You could calculate the probability of Address<->Description based on the occuring words. In this example it's quite obvious: the "road" line is much more likely to be part of an address than the "management" line.
This should work nicely if the non-address part will only appear after the company name. If it's possible that the non-address parts can be found somewhere in the text, it's getting near to impossible to separate them without further information.
Maybe you want to take a look on a similar question I asked yesterday.
Edit: You could create a statistical model based on previous categorized address-parts (the ones you are sure, that they are addresses ;) ).