In a couple of days the Uniq Hyde Air portable wired/wireless charging battery will be hitting the shelves, and it’s pretty impressive for such a small footprint.
For all things wired it works like a charm. You’re getting fast charging at roughly 18 watts across the board. That’s 18 watts total, though, so if you’re charging more than one thing, each device gets less power.
As a fast wireless charger it’s… acceptable. The lack of grips means your phone or device slips off pretty easily, and 10-watt wireless charging is still in the slow lane; the only reason to use it on the go is if you’re in a jam.
It’ll handle pass-through charging as well, so you can charge the power bank while it charges multiple devices (including wirelessly), then just grab the bank and take it with you when you need to go.
I believe this is the biggest bang for the smallest profile we’ve covered, although honestly I’m a little scatterbrained at this point with this metro council run going on. It’s what I’ve been having to use to try and keep my phone running throughout the day and so far it’s never complained.
The only negative, other than the lack of grips (which may or may not be a deal breaker for you), is that this device charges quickly and gets a little warm. That wouldn’t be a problem were I not lugging it around in my pocket, but it runs a good couple of degrees warmer than I’m used to.
I don’t think we’re in the danger zone, but when you’re walking door to door and you’ve got a warm rear end from fast charging, it’s noticeable.
It’s also got a kickstand that lets you prop up a phone and watch videos while it charges wired. If they’d thought the design through a little more, they probably could have borrowed the HTC EVO 4G kickstand idea and made it prop up a phone while charging it wirelessly, but dreams, you know….
The Hyde Air (“hi there!”) 10,000mAh wireless USB-C portable power bank will be released sometime this week on the manufacturer’s website, and presumably on Amazon thereafter.
It’s the most pocketable power bank I’ve seen for its capacity, and coming in at $40, it’s not a bad price for such a lightweight multi-function charging device.
Powered by WPeMatico
- Speech2Face: Learning the Face Behind a Voice — complete with an interesting ethics discussion up front. I wonder where this was intended to go: after all, it can’t perfectly reconstruct faces, so what you get is a stereotype based on the voice. Meh.
- Minivac 601 Replica (Instructables) — Created by information theory pioneer Claude Shannon as an educational toy for teaching digital circuits, the Minivac 601 Digital Computer Kit was billed as an electromechanical digital computer system.
- Nines Are Not Enough: Meaningful Metrics for Clouds — We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We also suggest that defining guarantees in terms of defense against threats, rather than guarantees for application-visible outcomes, can reduce the complexity of these problems.
- Announcing Envoy Mobile (Lyft Engineering) — as Simon Willison said: Lyft’s Envoy proxy / service mesh has been widely adopted across the industry as a server-side component for adding smart routing and observability to the network calls made between services in microservice architectures. “The reality is that three 9s at the server-side edge is meaningless if the user of a mobile application is only able to complete the desired product flows a fraction of the time”—so Lyft is building a C++ embedded library companion to Envoy which is designed to be shipped as part of iOS and Android client applications. “Envoy Mobile in conjunction with Envoy in the data center will provide the ability to reason about the entire distributed system network, not just the server-side portion.” Their decision to release an early working prototype and then conduct ongoing development entirely in the open is interesting, too.
- Why Are We So Pessimistic? (Brookings) — The belief or perception that things are much worse than they really are is widespread, and I believe it comes with significant detrimental impacts on societies.
- Perspectives and Approaches in AI Ethics: East Asia — Each country’s perspectives on and approaches to AI and robots on the tool-partner spectrum are evaluated by examining its policy, academic thought, local practices, and popular culture. This analysis places South Korea in the tool range, China in the middle of the spectrum, and Japan in the partner range.
“AI starts with ‘good’ data” is a statement that receives wide agreement from data scientists, analysts, and business owners. There has been a significant increase in our ability to build complex AI models for predictions, classifications, and various analytics tasks, and there’s an abundance of (fairly easy-to-use) tools that allow data scientists and analysts to provision complex models within days. As model building becomes easier, the problem of high-quality data becomes more evident than ever. A recent O’Reilly survey found that those with mature AI practices (as measured by how long they’ve had models in production) cited “lack of data or data quality issues” as the main bottleneck holding back further adoption of AI technologies.
Even with advances in building robust models, the reality is that noisy data and incomplete data remain the biggest hurdles to effective end-to-end solutions. The problem is even more magnified in the case of structured enterprise data. These data sets are often siloed, incomplete, and extremely sparse. Moreover, the domain knowledge, which often is not encoded in the data (nor fully documented), is an integral part of this data (see this article from Forbes). If you also add scale to the sparsity and the need for domain knowledge, you have the perfect storm of data quality issues.
In this post, we shed some light on various efforts toward generating data for machine learning (ML) models. In general, there are two main lines of work toward that goal: (1) clean the data you have, and (2) generate more data to help train needed models. Both directions have seen new advances in using ML models effectively, building on multiple new results from academia.
Data integration and cleaning
One of the biggest pitfalls in dealing with data quality is to treat all data problems the same. Academic research has been more deliberate in describing the different classes of data quality problems. We see two main classes of problems, which have varying degrees of complexity, and often mandate different approaches and tools to solve them. Since they consume a significant amount of time spent on most data science projects, we highlight these two main classes of data quality problems in this post:
- Data unification and integration
- Error detection and automatic repairing/imputation
Data unification and integration
Even with the rise of open source tools for large-scale ingestion, messaging, queuing, and stream processing, siloed data and data sets trapped behind the bars of various business units are the normal state of affairs in any large enterprise. Data unification or integration refers to the set of activities that bring this data together into one unified data context. Schema matching and mapping, record linkage and deduplication, and various mastering activities are the types of tasks a data integration solution performs. Advances in ML offer a scalable and efficient way to replace legacy top-down, rule-based systems, which often result in massive costs and very low success in today’s big data settings. Bottom-up solutions with human-guided ML pipelines (such as Tamr, Paxata, or Informatica—full disclosure: Ihab Ilyas is co-founder of Tamr) show how to leverage the available rules and human expertise to train scalable integration models that work on thousands of sources and large volumes of data. We discussed some of the challenges and enablers in using ML for this class of problems in an earlier post.
The class of data unification problems has its own characteristics in terms of solution complexity: (1) the problem is often quadratic in the size of the input (since we need to compare everything to everything else), and (2) the main ML task is fairly well understood: determining whether two “things” are the same. These characteristics have a considerable impact on the design of the solution. For example, a sophisticated model for finding duplicates or matching schemas is the least of our worries if we cannot even enumerate all the possible pairs that need to be checked. Effective solutions for data unification problems tend to be a serious engineering effort to: (1) prune the space of possible candidates; (2) interact effectively with experts to provide training data and validate the machine’s decisions; and (3) keep rich lineage and provenance to track decisions back for auditing, revising, or reusing in future use cases. Due to the nature of the ML task (mainly Boolean classification here) and the richness of structure, the most successful models tend to be the good old “shallow” models, such as random forests, with the help of simple language models (to help with string data). See this article on the status of data integration for details.
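To make the candidate-pruning step concrete, here is a minimal sketch in plain Python. The records, the blocking key, and the similarity threshold are all invented for illustration: a coarse blocking key limits comparisons to records that share the key (avoiding the quadratic all-pairs enumeration), and a simple name-similarity score stands in for a trained matcher.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Toy records; in practice these would come from many silos.
records = [
    {"id": 1, "name": "Acme Corp", "city": "Boston"},
    {"id": 2, "name": "ACME Corporation", "city": "Boston"},
    {"id": 3, "name": "Globex Inc", "city": "Chicago"},
    {"id": 4, "name": "Globex Incorporated", "city": "Chicago"},
    {"id": 5, "name": "Initech", "city": "Austin"},
]

def blocking_key(rec):
    # Coarse key: first 3 letters of the name plus the city. Only records
    # sharing a key are ever compared, pruning the quadratic pair space.
    return (rec["name"][:3].lower(), rec["city"].lower())

blocks = defaultdict(list)
for rec in records:
    blocks[blocking_key(rec)].append(rec)

# Enumerate candidate pairs within each block only.
candidate_pairs = [
    (a, b)
    for block in blocks.values()
    for i, a in enumerate(block)
    for b in block[i + 1:]
]

def similarity(a, b):
    # One crude feature; a real matcher would use many such features.
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()

# Threshold stands in for the trained classifier's decision.
matches = [(a["id"], b["id"]) for a, b in candidate_pairs if similarity(a, b) > 0.6]
print(matches)
```

In a real pipeline, the threshold test would be replaced by a trained classifier (for example, a random forest) over many such features, and blocking keys would be engineered or learned per domain; the point here is only that blocking reduces 10 possible pairs to 2 candidates.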
Error detection, repair, and value imputation
Siloed or integrated data is often noisy, missing, and sometimes even has contradicting facts. Data cleaning is the class of data quality efforts that focuses on spotting and (hopefully) repairing such errors. Like data integration, data cleaning exercises often have been carried out with intensive labor work, or ad-hoc rule-based point solutions. However, this class has different complexities and characteristics that affect the design of the solution: the core ML task is often far more complex than a matching task, and requires building models that understand “how data was generated” and “how errors were introduced” to be able to reverse that process to spot and repair errors.
While data cleaning has long been a research topic in academia, it often has been looked at as a theoretical logic problem. This probably explains why none of the solutions have been adopted in industry. The good news is that researchers from academia recently managed to leverage that large body of work and combine it with the power of scalable statistical inference for data cleaning. The open source HoloClean probabilistic cleaning framework is currently the state-of-the-art system for ML-based automatic error detection and repair. HoloClean adopts the well-known “noisy channel” model to explain how data was generated and how it was “polluted.” It then leverages all known domain knowledge (such as available rules), statistical information in the data, and available trusted sources to build complex data generation and error models. The models are then used to spot errors and suggest the “most probable” values to replace.
Paying attention to scale is a requirement cleaning and integration have in common: building such complex models involves “featurizing” the whole data set via a series of operations (for example, computing rule violations, counting co-occurrences, or building language models). Hence, an ML cleaning solution needs to be innovative in how it avoids the complexity of these operations. HoloClean, for example, uses techniques to prune the domains of database cells and applies judicious relaxations to the underlying model to achieve the required scalability. Older research tools struggled with how to handle the various types of errors and how to combine heterogeneous quality input (e.g., business and quality rules, policies, statistical signals in the data, etc.). The HoloClean framework advances the state of the art in two fundamental ways: (1) combining the logical rules and the statistical distribution of the data into one coherent probabilistic model; and (2) scaling the learning and inference process via a series of system and model optimizations, which has allowed it to be deployed in census organizations and large commercial enterprises.
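As a toy illustration of the rule-plus-statistics idea (this is not HoloClean’s actual API; the table, the functional dependency zip → city, and the majority-vote repair are all invented for this sketch), one can flag cells that violate a dependency and propose the statistically dominant value as the repair:

```python
from collections import Counter, defaultdict

# Toy table where the rule "zip determines city" should hold.
# One row disagrees with the majority for its zip code.
rows = [
    {"zip": "90210", "city": "Beverly Hills"},
    {"zip": "90210", "city": "Beverly Hills"},
    {"zip": "90210", "city": "Los Angeles"},   # suspect cell
    {"zip": "10001", "city": "New York"},
]

# Gather the statistical signal: city-value counts per zip.
by_zip = defaultdict(Counter)
for row in rows:
    by_zip[row["zip"]][row["city"]] += 1

# Flag minority values as likely errors and suggest the dominant value.
repairs = []
for i, row in enumerate(rows):
    most_common_city, _ = by_zip[row["zip"]].most_common(1)[0]
    if row["city"] != most_common_city:
        repairs.append((i, row["city"], most_common_city))

print(repairs)
```

HoloClean replaces this crude majority vote with a full probabilistic model that weighs many rules, co-occurrence statistics, and trusted sources jointly before committing to the “most probable” repair, but the detect-then-repair shape is the same.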
Increasing the quality of the available data via either unification or cleaning, or both, is definitely an important and a promising way forward to leverage enterprise data assets. However, the quest for more data is not over, for two main reasons:
- ML models for cleaning and unification often need training data and examples of possible errors or matching records. Depending completely on human labeling for these examples is simply a non-starter; as ML models get more complex and the underlying data sources get larger, the need for more data increases, the scale of which cannot be achieved by human experts.
- Even if we boosted the quality of the available data via unification and cleaning, it still might not be enough to power the even more complex analytics and predictions models (often built as a deep learning model).
An important paradigm for solving both these problems is the concept of data programming. In a nutshell, data programming techniques provide ways to “manufacture” data that we can feed to various learning and predictions tasks (even for ML data quality solutions). In practical terms, “data programming” unifies a class of techniques used for the programmatic creation of training data sets. In this category of tools, frameworks like Snorkel show how to allow developers and data scientists to focus on writing labeling functions to programmatically label data, and then model the noise in the labels to effectively train high-quality models. While using data programming to train high-quality analytics models might be clear, we find it interesting how it is used internally in ML models for the data unification and cleaning we mentioned earlier in this post. For example, tools like Tamr leverage legacy rules written by customers to generate a large amount of (programmatically) labeled data to power its matching ML pipeline. In a recent paper, the HoloClean project showed how to use “data augmentation” to generate many examples of possible errors (from a small seed) to power its automatic error detection model.
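The labeling-function idea can be sketched in a few lines of plain Python (this is not Snorkel’s API; the functions, label names, and texts are invented): each function votes for a label or abstains, and a simple majority vote turns the noisy votes into programmatic training labels.

```python
# Labels for a toy spam task: 1 = spam, 0 = not spam, None = abstain.
ABSTAIN, NOT_SPAM, SPAM = None, 0, 1

def lf_contains_winner(text):
    # Heuristic: "winner" suggests spam.
    return SPAM if "winner" in text.lower() else ABSTAIN

def lf_contains_free(text):
    # Heuristic: "free" suggests spam.
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_short_message(text):
    # Heuristic: very short messages are usually benign.
    return NOT_SPAM if len(text.split()) < 4 else ABSTAIN

labeling_functions = [lf_contains_winner, lf_contains_free, lf_short_message]

def majority_label(text):
    # Collect non-abstaining votes and take the majority.
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

texts = [
    "You are a WINNER claim your free prize",
    "ok see you",
    "quarterly report attached for review",
]
labels = [majority_label(t) for t in texts]
print(labels)
```

Frameworks like Snorkel go further than a raw majority vote: they model the accuracies and correlations of the labeling functions and emit probabilistic labels, which is what lets noisy, conflicting heuristics still train high-quality models.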
The landscape of solutions we presented here in the quest for high-quality data has already been well validated in the market today.
- ML solutions for data unification such as Tamr and Informatica have been deployed at a large number of Fortune-1000 enterprises.
- Automatic data cleaning solutions such as HoloClean already have been deployed by multiple financial services and the census bureaus of various countries.
- As the growing list of Snorkel users suggests, data programming solutions are beginning to change the way data scientists provision ML models.
As we get more mature in understanding the differences between the various problems of integration, cleaning, and automatic data generation, we will see real improvement in handling the valuable data assets in the enterprise.
Machine learning applications rely on three main components: models, data, and compute. A lot of articles are written about new breakthrough models, many of which are created by researchers who publish not only papers, but code written in popular open source libraries. In addition, recent advances in automated machine learning have resulted in many tools that can (partially) automate model selection and hyperparameter tuning. Thus, many cutting-edge models are now available to data scientists. Similarly, cloud platforms have made compute and hardware more accessible to developers.
Models are increasingly becoming commodities. As we noted in the survey results above, the reality is that a lack of high-quality training data remains the main bottleneck in most machine learning projects. We believe that machine learning engineers and data scientists will continue to spend most of their time creating and refining training data. Fortunately, help is on the way: as we’ve described in this post, we are finally beginning to see a class of technologies aimed squarely at the need for quality training data.
Amazon’s drone delivery service is in the news again, this time with a new drone named the “MK27.” As the latest iteration of Amazon’s Prime Air delivery drone, it’s apparently safer, more efficient, and more stable than previous models, according to Amazon’s CEO Worldwide Consumer, Jeff Wilke. Wilke also said the company expects to scale the Prime Air delivery drone quickly, with the hope that it will be able to bring packages to customers “within months.”
For some, this may sound like déjà vu, and they’d be right. In December 2013, Amazon CEO and Founder Jeff Bezos told “60 Minutes” that drones would be flying to customers’ homes within five years. But that deadline came and went due to the many regulatory and technical hurdles that drone delivery companies face. Since Amazon first announced its plan for a drone delivery service, the company has gone through more than two dozen drone designs, none of which was able to adequately avoid other aircraft, objects, or people on the ground.
The Federal Aviation Administration (FAA), which regulates commercial drone use in the U.S., has approved previous Amazon drones for test flights, but each new prototype needs a special airworthiness certificate. Nevertheless, the FAA told Forbes it has approved a year of research and testing for Amazon, allowing the company to operate its new unmanned aircraft for research, development, and crew training in authorized flight areas, though not for deliveries.
Still, to Amazon’s credit, they have been hard at work moving drone technology forward for all. This newest drone comes packed with some impressive features, including artificial intelligence (AI) that allows it to operate more autonomously. For example, if the drone’s flight environment changes while it’s in transit or it comes into contact with a moving object while approaching the delivery destination, it will either make an appropriate evasive move, delay delivery, or abort delivery altogether. For most drones, this kind of “sense and avoid” behavior is performed by a remote pilot. In Amazon’s case, the drone employs proprietary computer vision and machine learning algorithms that can detect people, animals, and even wires and know what to do about them as they descend into, and ascend out of, a customer’s yard.
The other “cool factor” of this drone is its hybrid design. As this video shows, it takes off and lands vertically, like a helicopter, and flies horizontally and aerodynamically like an airplane—and it transitions between these two modes. While hybrid aircraft (aka “tail-sitters”) are nothing new, Amazon’s drone avoids the common shortcoming of most hybrid drones like this one by including a vision positioning system to keep it stable in the wind when landing.
When can I enjoy drone delivery?
All said and done, you can scratch that “within months” phrase. Realistically, Amazon is looking at a couple of years before any regular delivery operations, and even those would be restricted, because the FAA has yet to work out solutions for old and persistent problems like identifying drones, flying them over people, flying them in urban areas, and managing unmanned aircraft traffic, to name a few. Until then, we’re likely to see more headlines. Who knows, perhaps you’ll be one of the lucky first recipients to get an emergency package of diapers delivered to your back yard via Amazon Prime Air once testing begins.