Telecom operators are constantly looking for better and faster ways to exploit the user-generated data flowing through their networks, both to optimize their own activities and to create new business models and revenue streams.
Geolocation provides a highly effective way of improving customer insight and developing new income alongside the core business.
For operators, network coverage has always been a key selling point; carriers refer to it as POP (Percentage of Population). Users are willing to pay a few dollars more to get the best coverage wherever they live or work. Although fixed broadband at home is becoming commonplace in the developed world, the way to achieve it can differ depending on the country or even the specific geography of a region.
In a world of increasing threats, the preventive and responsive capabilities of national security bodies regarding manmade disasters are challenged daily. Government agencies increasingly collaborate with private industries in the development of innovative technologies to better protect the populations, while taking into account local privacy regulations.
For decades, mobile network infrastructures have clearly been instrumental in lawful investigations. And yet, only a fraction of mobile networks' potential has been exploited. In the era of big data, telecommunications networks constitute a strategic asset in the fight against current threats. Indeed, an ocean of data from several million individuals flows through these virtual pipes daily. To comply with increasing intelligence requirements, new investigation tools are required, leveraging the inherent benefits of big data technologies.
First, this document highlights the possibilities and limitations offered by 3GPP standards on geolocation. Then it addresses the interest of passive collection, an alternative technology, to complement these standards. Last, it illustrates examples of innovative security use cases that can be unlocked by such technologies.
Since their inception, mobile network standards have defined subscriber geolocation among their many functions, alongside core services such as calls and SMS. These standards define dedicated signaling and processes in the various network nodes. This approach is called active geolocation, as it triggers specific network activity to retrieve a given subscriber's location.
As these technologies are standardized, they have been widely leveraged by law enforcement: upon request by legal authorities, mobile network teams use them to track the location of identified suspects.
Such technologies present outstanding features, the most important being that they are entirely device-agnostic and unspoofable, which in turn provides highly reliable information to act on. They also offer several modes of operation, favoring either instant refresh of a subscriber's location or undetectability of the location retrieval. Their basic accuracy is one network cell, which typically ranges from 100 meters in dense urban areas to several kilometers in rural ones. This accuracy can be improved through radio-triangulation techniques (E-CID, A-GPS, OTDOA, etc.).
These options come at a price, as they require additional network equipment and features to be activated. And the end results do not always meet initial expectations (for example, triangulation is impossible in an area covered by a single cell).
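To make the single-cell limitation concrete, here is a toy sketch of multi-cell position refinement as a signal-weighted centroid. This is an illustrative assumption, not the standardized E-CID or OTDOA algorithms (which rely on timing and angle measurements); it merely shows why one cell alone cannot improve on cell-level accuracy.

```python
def weighted_centroid(cells):
    """Estimate a device position from several observed cells.

    `cells` is a list of (lat, lon, signal_weight) tuples; stronger
    signals pull the estimate toward that cell site. With a single
    cell, the estimate collapses to the cell site itself -- the
    "you can't triangulate one cell" limitation noted above.
    """
    total = sum(w for _, _, w in cells)
    lat = sum(la * w for la, _, w in cells) / total
    lon = sum(lo * w for _, lo, w in cells) / total
    return lat, lon

# One cell: no refinement possible, estimate = cell site.
single = weighted_centroid([(48.85, 2.35, 1.0)])
# Two cells: the estimate moves between the sites, weighted by signal.
multi = weighted_centroid([(48.85, 2.35, 3.0), (48.86, 2.36, 1.0)])
```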
However, we also need to bear in mind the drawbacks of standard-based active geolocation:
Because each tracking request generates additional network activity, the volume of geolocation tracking is limited: first, because operational expenses are directly linked to the number of location queries; second, because too many tracking requests would quickly overwhelm mobile network capacity.
Criminals are well aware of these capabilities and are used to frequently swapping their SIM card and mobile handset in order to evade any surveillance. Another technique is to keep the phone switched off unless necessary. In such cases, active queries will most probably fail to geolocate the subscribers.
Several blind spots
In theory, the standards allow geolocation of inbound roamers, but this requires operators to establish bilateral interconnections between their geolocation equipment. In practice, this architecture has not been deployed, so operators cannot geolocate visitors on their networks. Another blind spot: it is not possible to track subscribers based on their device identifier (IMEI).
No exploratory investigation
The whole system assumes that the targets (and their mobile number) are already identified and under investigation. Active geolocation offers no option to leverage historical data to explore and detect suspicious behaviors.
In summary, active geolocation presents undeniable benefits but also severe limitations. As it only leverages a fraction of all relevant information flowing through mobile networks, exploring complementary approaches is required to improve its efficiency.
To make your mobile phone ring when someone tries to reach you, the mobile network needs to know where you are at all times. For the sake of this very basic function, network signaling natively carries location information.
The idea of passive geolocation is to constantly listen for and extract this information. It is called "passive" because, contrary to active geolocation, it does not generate any specific network activity. As a consequence, it is possible to continuously track the location of each and every connected mobile, with no impact on network load.
There are many events that carry location information: events originating from subscriber activity (sending or receiving SMS and calls, surfing the web, receiving e-mails, app synchronization with the network, etc.) but also events originating from mobility management, such as switching the device on or off or handing over from one location area to another. The high frequency of such events ensures very fine tracking of all subscribers' locations at any point in time. It is handset-agnostic, unspoofable and undetectable by the targeted individual.
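A minimal sketch of what passive collection accumulates: each skimmed signaling event yields a record tying identities to a cell and a timestamp, and a store keeps the freshest location per identity. The field names and `update_last_known` helper are hypothetical illustrations, not an actual collection API.

```python
from dataclasses import dataclass

@dataclass
class LocationEvent:
    """Hypothetical record for one event skimmed from signaling links."""
    imsi: str        # SIM identity
    imei: str        # handset identity
    cell_id: str     # serving cell at event time
    timestamp: float
    event_type: str  # "sms", "call", "handover", "attach", ...

def update_last_known(store, event):
    """Keep the freshest event per subscriber, keyed by both the SIM
    (IMSI) and the handset (IMEI), so either identity can be queried."""
    for key in (event.imsi, event.imei):
        prev = store.get(key)
        if prev is None or event.timestamp > prev.timestamp:
            store[key] = event
    return store

store = {}
update_last_known(store, LocationEvent("001", "IMEI-A", "cell-17", 100.0, "attach"))
update_last_known(store, LocationEvent("001", "IMEI-A", "cell-23", 160.0, "handover"))
```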
Collecting events with passive location
Passive collection works on traditional mobile networks (2G, 3G or 4G) but can be extended to other types of networks. A typical example is Wi-Fi networks, used by operators to off-load their cellular networks through data or voice over Wi-Fi services. Such extensions provide better coverage (indoor) and spatial accuracy (smaller cells) for geolocation.
Make the most of the limited active geolocation capabilities.
Why trigger a costly active request if passive collection already tells you where the target was three minutes ago? Another example: it may be useful to request location updates only when a target is close to a specific area of interest, relying on passive collection everywhere else.
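The hybrid policy above can be sketched as a tiny decision function. The three-minute freshness threshold and the function name are illustrative assumptions, not a documented product rule.

```python
def should_query_actively(passive_fix_age_s, near_area_of_interest,
                          max_age_s=180):
    """Decide whether a costly active location request is worthwhile.

    Hypothetical policy: skip the active query while a passive fix is
    fresh enough (here, under three minutes old), unless the target is
    near a watched area and an up-to-the-second fix is wanted.
    """
    if near_area_of_interest:
        return True
    return passive_fix_age_s > max_age_s

# Fresh passive fix, target far from any area of interest: no query.
assert should_query_actively(60, False) is False
# Stale fix: fall back to an active request.
assert should_query_actively(600, False) is True
```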
Inbound roamers are natively geolocated through passive collection, as they generate activity on the network like any other subscriber. Furthermore such passive collection can be used to unlock active geolocation queries on inbound roamers, without the need to query foreign networks.
Passive collection enables tracking of individuals by their SIM card, their mobile number or their terminal identifier (IMEI). This makes it possible to track individuals by device, even without knowing their mobile number. Additionally, switching SIM cards is no longer enough to evade surveillance: suspects can no longer simply disappear.
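Because events carry both identities, a simple pass over collected events can flag the SIM-swapping evasion pattern mentioned earlier. This is a hedged sketch over an assumed (imei, imsi, timestamp) tuple format, not a real detection pipeline.

```python
from collections import defaultdict

def detect_sim_swaps(events):
    """Flag handsets (IMEIs) observed with more than one SIM (IMSI).

    `events` is an iterable of (imei, imsi, timestamp) tuples, as
    passive collection might produce. Tracking by IMEI defeats the
    classic evasion tactic of frequently swapping SIM cards.
    """
    sims_per_handset = defaultdict(set)
    for imei, imsi, _ts in events:
        sims_per_handset[imei].add(imsi)
    return {imei: sims for imei, sims in sims_per_handset.items()
            if len(sims) > 1}

swaps = detect_sim_swaps([
    ("IMEI-A", "imsi-1", 10), ("IMEI-A", "imsi-2", 20),
    ("IMEI-B", "imsi-3", 15),
])
```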
Fewer blind spots
In addition to inbound roamers and IMEI tracking, passive collection ensures no mobile activity goes undetected. As soon as a mobile connects to the network, even for a very short amount of time, this event is captured and the location is updated. As an additional benefit, the last known location (of a missing person for example) is always available.
Active vs passive location
To summarize, combining both geolocation techniques optimizes network resources, reduces classical blind spots and provides new tracking capabilities.
Storage of the continuous stream of geolocated events unlocks further possibilities of investigations, using additional analytics software.
Applications are countless:
Suspect identification: suspicious targets can be identified through their physical presence at past key incidents, their frequent proximity to other persons of interest and their mutual communications. These analyses can be enriched with various filters (e.g. targets who have transited through, or communicate with, a specific country). They aim at reconstructing the social circles gravitating around criminal organizations. This process can be fine-tuned and confirmed through the detection of suspicious patterns such as frequent swapping of SIM cards and/or handsets.
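One building block of such proximity analysis can be sketched as a co-presence check: two identifiers seen in the same cell within a short time window. The pairwise scan and the five-minute window are illustrative assumptions; production analytics would use indexed, large-scale equivalents.

```python
def co_present(events_a, events_b, window_s=300):
    """Return the cells where two identifiers were observed within
    `window_s` seconds of each other -- a crude proximity signal for
    proposing candidate links between persons of interest.

    Each event list holds (cell_id, timestamp) pairs.
    """
    hits = set()
    for cell_a, t_a in events_a:
        for cell_b, t_b in events_b:
            if cell_a == cell_b and abs(t_a - t_b) <= window_s:
                hits.add(cell_a)
    return hits

hits = co_present([("cell-5", 1000), ("cell-9", 4000)],
                  [("cell-5", 1100), ("cell-9", 9000)])
# cell-5 matches within the window; cell-9 observations are hours apart.
```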
Forensic judicial reconstruction: historical records can be used to replay location events over the course of any given period. Given the update frequency of location events, individuals' paths can be easily reconstructed and visualized.
The protection of sensitive areas (international borders, official buildings, military facilities, hazardous waste repositories, etc.) requires constant surveillance, and passive collection can play a significant role in the detection of trespassers.
Sensitive areas to be protected efficiently
This feature, known as “geofencing”, triggers instant notification whenever someone enters (or leaves) a predefined perimeter.
Whereas classical geolocation systems limit geofencing to a list of identified suspects, systems using passive collection can notify on any entry, with filtering tools to differentiate authorized personnel from suspicious intruders. Such instantaneous notifications can feed applications with real-time data flows for contextual analysis.
In case of specific threat / emergency, public authorities can monitor in real-time who is present in a given area. Thresholds can be set to notify authorities whenever abnormal attendance in an area is detected.
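The geofencing, whitelist filtering and attendance-threshold ideas above can be sketched together as a small state machine over the observation stream. Modeling the perimeter as a set of cell IDs, and all names below, are illustrative assumptions rather than an actual product interface.

```python
def geofence_alerts(positions, fence_cells, whitelist, max_attendance):
    """Process a stream of (identifier, cell_id) observations.

    A hypothetical perimeter is modeled as a set of cell ids. A
    non-whitelisted identifier entering the perimeter raises an
    intrusion alert; when attendance inside exceeds `max_attendance`,
    a crowd alert is raised.
    """
    inside = set()
    alerts = []
    for ident, cell in positions:
        if cell in fence_cells:
            if ident not in inside:
                inside.add(ident)
                if ident not in whitelist:
                    alerts.append(("intrusion", ident))
            if len(inside) > max_attendance:
                alerts.append(("crowd", len(inside)))
        else:
            inside.discard(ident)  # identifier observed outside: left
    return alerts

alerts = geofence_alerts(
    [("guard-1", "cell-F"), ("x-42", "cell-F"), ("x-42", "cell-out")],
    fence_cells={"cell-F"}, whitelist={"guard-1"}, max_attendance=5)
# The whitelisted guard raises nothing; the unknown device does.
```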
Location-Based Services usually rely on the Cell-ID location of customers, giving an average precision of 200-500 m in urban environments and of 1 to 10 km in rural areas.
Such accuracy is sufficient for many use cases: in geotargeted advertising, for example, it would be considered too intrusive to text a customer standing exactly in front of the corresponding point of sale. But there are cases where more precision is needed, and where new technologies can reach sub-cell accuracy.
While cell accuracy can be perfectly relevant for measuring attendance in a mall or a stadium, it is clearly inadequate for smaller venues or for computing the audience around a billboard. Real-estate studies are also highly demanding in terms of location measurements. In transport planning applications, Cell-ID can only account for fast means of transportation (car, train, plane), where the serving cell often changes during the journey. It is not accurate enough to study bicycle or pedestrian paths, to distinguish the traffic on two neighboring parallel roads, or to avoid false positives in security use cases. Even in location-based advertising, it can be critical to avoid confusing a customer entering a mall with one driving along the nearby highway!
“Triangulating” the position of a mobile requires access to detailed information about the radio access network surrounding the device. This information circulates within the OSS (Operation Support System), and the SMLC (Serving Mobile Location Center) actively fetches it to locate a mobile precisely. This method works fine for individual tracking but, because it generates additional signaling in the network, it cannot collect the locations of everyone in a given area.
Probe systems are also a mine of technical information. However, because this equipment was primarily designed to monitor quality of service in the access network, it is not well adapted to tracking individual movements: locations are provided with delay, lack regular updates, and imply too much investment compared to the benefits expected from Location-Based Services.
Intersec has developed an original approach, extracting just the necessary information from the RAN (Radio Access Network). As this passive collection is achieved from centralized equipment with a limited footprint, it is far more affordable and resilient to network upgrades.
Such information is combined with data from all available networks (including Wi-Fi) to limit the number of blind spots for any individual device.
This Passive SubCell technology offers new perspectives to the service providers: more precise studies in rural areas, on-line visualization of crowd densities and audience real-time management for billboard advertisers to name a few. It is currently used by several of our customers and the results are amazingly positive.
The limitations inherent to 3GPP-standard geolocation constrain security applications. We have seen that new technologies, mixing passive collection and big/fast data, can be valuable complements to current geolocation standards, making up for their shortfalls and enabling new security use cases. Adding Passive SubCell technology is also a way to deal with rural areas, where cell size otherwise prevents any location-based insight.
This paper has only touched on the analytical software performance and data science capabilities required to transform raw data into contextual insights. Indeed, the development of a library of location-specific algorithms is the next step in industrializing forensic analyses and the operation of these use cases. Such algorithms range from identifying the origin and destination of suspects, to the routes and means of conveyance taken, to counting recurring visits to specific areas.
The technologies described enable a plethora of other use cases of public interest that cannot be detailed in such a condensed document. In case of a major event, they can enable authorities to locate Wi-Fi emergency calls and analyze population movements in real time, thus improving the dissemination of factual guidelines to emergency services and local communities while adapting resources accordingly. Equipped with the software technologies described, national security and law enforcement agencies can significantly enrich intelligence and optimize responsiveness to better protect and serve the population.
Leading companies worldwide are redesigning their infrastructure and organization to adapt their business to the digitization wave. Indeed, the migration from analog to digital products and services requires handling and extracting insights from an unprecedented volume of data.
In the last couple of years, the advent of open-source technologies has fueled big data initiatives intended to materialize new business models. Although there is consensus on the potential business opportunity behind this megatrend, there is none on how to seize value from it. As a software vendor, we believe that technology is a means, not an end.
In the following pages, we will highlight a set of best practices (questions to ask yourself, tips and pieces of advice) to avoid being trapped in the never-ending big data project that fails to generate any revenue. Our experience comes from the telecoms industry, one of the most data-intensive environments, but the concepts addressed can be transposed to other verticals.
The current digitization wave provides new space for business development. The phenomenon materializes differently depending on whether traditional or digital-native players are considered:
For traditional players, it allows them to:
For digital native players, it offers capabilities to develop new business models:
Cards, sensors and smart devices enable the collection and transmission (through GSM, ATM, RFID, etc.) of data records from daily events. Until recently, the available technical capabilities did not allow processing such a variety and volume of data.
The current evolution in ICT allows businesses to store and analyze data to extract valuable insights. These have the primary objective of enhancing a company’s activities in order to provide better products and services.
It is no wonder that mobile operators are all interested in big data: the huge amount of data they collect from their networks and IT opens the door to fascinating usages.
Big data software deployments drive both revenue generation and cost reduction. On the revenue side, they can lead to internal or external monetization. And on the cost-saving side, they impact marketing, network, fraud management and customer experience.
All these fascinating use cases raise many questions from a technical, financial and organizational standpoint. Let’s go through some of the common questions and challenges operators are facing when starting a Big Data project.
The first challenge one can think of is technical: collecting such a wide variety of data from so many different sources can look terribly complex. Have you ever considered mixing very different silos of data, such as customer-centric databases, transaction detail records, network information, and data from external websites, partners, etc.?
As a matter of fact, our experience shows this technical question is not the hardest one to solve. There are many ways to ingest massive streams of data into a single data store. The real challenge lies elsewhere, in more mundane questions.
- Considering the data, most of it is already collected by the telecom operator. However, these collection points were designed for a specific operational purpose and were never meant to be loaded with heavy queries. So should operators rely on their existing systems and bear with slow response times? Or should they rather bypass them and build parallel collection directly from the sources?
- Another non-technical issue is that much information is redundant within an operator's IT. For example, customers' ARPU (Average Revenue Per User) is calculated in many different systems with different rules, according to the indicator's end use. It is therefore not so much about finding where the data is as about choosing which of the different sources will be taken as the reference. Whatever the choice, operators need to bear in mind that it will never be 100% consistent with other KPIs already in place across the company.
It is possible to successfully build big data infrastructures from scratch; companies like Google or Yahoo, for instance, have chosen to use Hadoop. However, even though Hadoop has matured over the past years, our experience is that Hadoop implementations built from do-it-yourself infrastructure components always require more time and effort than initially expected to reach full production.
Of course, it is inexpensive (low infrastructure cost) and can be downloaded for free. But the other side of the coin is that building a do-it-yourself Hadoop infrastructure requires time, resources and expertise. The question of skills is all the more important as the industry reports shortages of skilled staff in all links of the chain (IT, BI and analytics, database administration, etc.).
In 2016, despite an apparently growing talent pool, companies have to offer bigger salaries to attract Hadoop-skilled talent, and go all-out to retain it. So here we are: demand for developers and infrastructure specialists with open-source skills continues to grow, according to the 2016 Open Source Jobs Report, and organizations are still struggling to fill positions.
If building your own solution feels safer because you master your environment, this choice usually comes with drawbacks, as it involves a longer time-to-market and hidden costs that can counterbalance the benefits. This choice is so difficult that some of our customers started with an internal project and ended up with off-the-shelf solutions. We believe this is one of the cleverest ways to go.
Most of the time, examples of big data refer to data lakes: storing petabytes of information on the fly and, from time to time, running a powerful algorithm on these huge databases to get a clear picture of segmentation, risks, scores, etc. This approach suits studies that do not require an immediate trigger for action.
However, there is another approach, consisting not only in storing but also in analyzing the data while it is still flowing, to detect patterns. This Fast Data approach is particularly relevant when timing is of the essence: for example, to detect fraud, to grant an authorization, or to notify customers that they have reached a certain usage threshold or credit limit.
These applications require real-time pattern detection but also the capacity to trigger a set of relevant actions on the fly. Different technologies can be used depending on the applications the company wishes to set up.
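The usage-threshold example above can be sketched as a streaming monitor that acts on each event as it arrives, rather than batch-scanning a data lake later. The event format and quota map are illustrative assumptions, not a real charging interface.

```python
def usage_monitor(events, thresholds):
    """Fast Data sketch: process usage events as they flow and fire a
    notification the moment a subscriber crosses their quota.

    `events` is an iterable of (subscriber, megabytes) tuples;
    `thresholds` maps subscriber -> quota in megabytes.
    """
    totals = {}
    notifications = []
    for sub, mb in events:
        before = totals.get(sub, 0)
        totals[sub] = before + mb
        quota = thresholds.get(sub)
        if quota is not None and before < quota <= totals[sub]:
            notifications.append((sub, quota))  # fire exactly once
    return notifications

notes = usage_monitor([("alice", 600), ("alice", 500), ("alice", 100)],
                      {"alice": 1000})
# The second event pushes alice past her 1000 MB quota.
```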
There are two ways of approaching a big data project: either start small and launch the project based on one or two profitable use cases, or consider it a must-have for your company and build large storage capabilities designed to cope with unknown future use cases.
Of course usually operators opt for a mix between these extremes, but it is interesting to examine the consequences of one choice or the other:
When operators opt for a use case approach, they are driven by a clear goal and usually their commitment to deliver is stronger. By definition, they are less challenged on the ROI question afterwards.
However, it can be challenging to find the relevant use case, because business units have different priorities, because they wish to address their use case on a specific platform, or simply because they do not share the big data project manager's enthusiasm about the business opportunity. In any case, our experience is that every business plan has limits and the future never unfolds as planned. So our recommendation is to ensure both the technical solution and the organization will be compatible with many different usages, far beyond what was initially planned.
The other approach consists in storing everything and betting that new usages will emerge from this lake of raw information. This approach usually provides larger degrees of freedom to try innovative use cases, and it can give a bit more time to freely explore the data before deciding which use cases will be rolled out next.
Experience has taught us that the lack of commitment on ROI can slow down the decision to launch the project or to make clear trade-offs between multiple options. The risk is ending up with fuzzy objectives that may change over time.
There is nothing wrong about this approach, but it demands strong buy-in from business lines and a clear governance model to make sure the project keeps on track.
What about the organization of a big data project? Should the project be carried out by the IT department or should it be led by a dedicated organization, under a new function like a Chief Data Officer, distinct from traditional IT? Of course there is no magical recipe for success.
In our experience, whenever the IT department leads the project, it benefits from natural legitimacy in terms of technical choices. It also has greater flexibility on resources, because a big data project requires significantly less investment than a core IT system. It may also experience lower pressure on ROI or on delivering new use cases.
On the other hand, dedicating a special team to big data projects can be an interesting alternative, because it comes with a clear objective and is usually driven by a P&L. Mixing different competencies within the same organization can spur an innovative mindset, with clear incentives to deliver results. The drawback is that a dedicated team may struggle to convince other departments (either IT or business lines) to use a specific platform rather than alternative solutions. This approach is usually faster to deliver first results but can lose momentum after a certain period of time.
To summarize, we have identified several success factors listed hereafter:
Overall, be nimble. Stay flexible, in order to adapt and learn as quickly as technologies evolve. A myriad of opportunities are available, but keep in mind that you cannot explore them all at once. Technologies are a means to unlock, develop and scale business. Knowing where you are now and where you want to go, while staying down to earth, is the path to success.