Multivariate outlier modeling for capturing customer returns - How simple it can be.

A preemptive approach selects correlated tests to construct multivariate test models to screen out outliers. This approach does not rely on known customer returns. ... This work studies the potential of capturing customer returns with models constructed based on multivariate analysis of parametric wafer sort test measurements. ... Understanding Multivariate Outliers: In this PC space, an outlier model can be learned, but for simplicity, test limits are set in each PC. ...

doi:10.1109/test.2012.6401547 dblp:conf/itc/SumikawaTWWA12 fatcat:gbuz7syfrjgu7kwcdurbxig5ui
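
The snippet above mentions setting test limits in each principal component (PC) rather than learning a full outlier model. A minimal sketch of that idea follows, assuming NumPy, a symmetric limit of k standard deviations per PC, and hypothetical function names (`fit_pc_limits`, `screen`); none of these details come from the paper itself.

```python
import numpy as np

def fit_pc_limits(X, k=3.0):
    """Learn a PC basis from reference measurements and set a limit per PC.

    X is an (n_parts, n_tests) array. The k-sigma limit is an assumption
    made for illustration, not a value taken from the source.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Eigen-decomposition of the correlation matrix gives the PC directions.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # The standard deviation along each PC is sqrt(eigenvalue).
    limits = k * np.sqrt(np.maximum(eigvals, 0.0))
    return mean, std, eigvecs, limits

def screen(X_new, model):
    """Return True for parts whose score exceeds the limit in any PC."""
    mean, std, eigvecs, limits = model
    scores = ((np.atleast_2d(X_new) - mean) / std) @ eigvecs
    return np.any(np.abs(scores) > limits, axis=1)

# Two strongly correlated synthetic tests (illustrative data only).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X_ref = np.column_stack([x, x + 0.1 * rng.normal(size=500)])
model = fit_pc_limits(X_ref)
flags = screen(np.array([[1.0, 1.0], [2.0, -2.0]]), model)
```

In this synthetic example the second part sits inside a 3-sigma limit on each test taken alone, but it breaks the correlation between the two tests, so it exceeds the limit on the low-variance PC.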

For each of the anomaly instances, ground truth labels for the root cause interval as well as those for the extended effect interval are provided, supporting the development and evaluation of a wide range ... In this paper, we present Exathlon, the first comprehensive public benchmark for explainable anomaly detection over high-dimensional time series data. ... Its AD module uses simple statistical methods like MAD, which is known to be suitable only for detecting simple point outliers [13]. ...

arXiv:2010.05073v3 fatcat:yjlkxps4fzgutax3prww2vrjmq
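
The snippet above refers to MAD as a simple method suited to point outliers. As a rough illustration, assuming MAD here means the median absolute deviation, the sketch below flags points by their modified z-score. The function name `mad_outliers` and the 3.5 threshold are assumptions for illustration and are not taken from the Exathlon paper.

```python
import numpy as np

def mad_outliers(x, threshold=3.5):
    """Flag point outliers using the median absolute deviation (MAD)."""
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # No spread around the median: nothing can be scored reliably.
        return np.zeros(x.shape, dtype=bool)
    # 0.6745 rescales MAD so the score is comparable to a z-score
    # when the data are normally distributed.
    modified_z = 0.6745 * (x - median) / mad
    return np.abs(modified_z) > threshold

flags = mad_outliers([10, 11, 9, 10, 12, 10, 11, 50])
```

Each point is scored independently of its neighbours, which is why a score of this kind can catch an isolated spike but not an anomaly that only shows up as a pattern over time.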

It begins by introducing several important concepts in statistical learning and summarizes different types of learning algorithms. ... Plot (2) shows how the outlier model, when applied, could have captured another return manufactured several months later. ... Plot (1) shows how a return is learned and projected as an outlier in a 3-dimensional test space. ...

doi:10.1145/2593069.2596675 dblp:conf/dac/WangA14 fatcat:f72np5a57nhthb36jc7xgtgwzu

SynC first removes potential outliers in the data and then fits the filtered data with a Gaussian copula model to correctly capture dependencies and marginal distributions of sampled survey data. ... Although it is a fundamental step for many data science tasks, an efficient and standard framework is absent. ... We also provide the list of synthetically generated persons for postal code V3N1P5 in Table 2, and use this as an example to demonstrate how probabilistic matching can be done on the first customer. ...

arXiv:1904.07998v2 fatcat:5uktjhdcwnbepfauwivroyzq6a

This would allow them to be assessed for their availability to provide demand services to the grid. ... An objective of AMR deployment is to clarify the nature and variability of the residential LV customer. ... not captured by this model which would have implications on how well this customer would serve a demand reduction program or a load shifting one. ...

doi:10.1109/tsg.2013.2286698 fatcat:va37dx4ykzbm3iswxjjapogqzy

However, it is not clear how a model trained on a sample compares with one trained on the entire dataset. ... To overcome this issue, outlier detection methods can be trained over samples of the full-sized dataset. ... to sampling, then an effective model can be built over the sample instead of the entire dataset. ...

arXiv:1907.13276v1 fatcat:uhsgk5lbozfjdehdf2lvfnyagm

and outliers. ... Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. ... As opposed to learned indexes, Hermit models the correlation between two columns and leverages the curve-fitting technique to adaptively create simple yet customized ML models for different regions (TRS-Tree ...

arXiv:1903.11203v2 fatcat:dc2p2yzdo5cplkkhkq5g2bvpka

This work presents three pattern mining methodologies for inter-wafer abnormality analysis. ... Given a wafer of interest, the second methodology searches for a test perspective that reveals the abnormality of the wafer. ... customer return wafer Fig. 3. Multivariate outlier model for the customer returns. Fig. 4. Is w1 more similar to w2 or to w3? ...

doi:10.1109/test.2013.6651890 dblp:conf/itc/SumikawaWA13 fatcat:cxkvbiexlzaoflj3tptzpbg22u

Those OOD samples can lead to unexpected outputs, as dialects of those samples are unseen during model training. ... We utilize the latent embeddings from all intermediate layers of a wav2vec 2.0 transformer-based dialect classifier model for multi-task learning. ... It can also be used to identify and learn new dialects for the system. ...

arXiv:2308.04886v1 fatcat:kz2gqi7pvzha7b3jpkvdif7lki
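
The entry above concerns out-of-distribution detection with Mahalanobis distance on latent embeddings. A minimal sketch of the distance computation alone follows, assuming NumPy and a single Gaussian fitted to in-distribution embeddings. The random vectors stand in for real embeddings, and the function names are hypothetical; the wav2vec 2.0 model and the multi-task setup of the paper are not reproduced.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and precision matrix of in-distribution embeddings."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates a singular covariance estimate.
    precision = np.linalg.pinv(cov)
    return mean, precision

def mahalanobis(X, mean, precision):
    """Mahalanobis distance of each row of X from the fitted Gaussian."""
    d = np.atleast_2d(X) - mean
    sq = np.einsum("ij,jk,ik->i", d, precision, d)
    return np.sqrt(np.maximum(sq, 0.0))

# Placeholder "embeddings" (illustrative random data only).
rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 4))
mean, precision = fit_gaussian(train)
near = mahalanobis(np.zeros(4), mean, precision)
far = mahalanobis(np.full(4, 8.0), mean, precision)
```

A sample would then be treated as out-of-distribution when its distance exceeds a threshold chosen on held-out in-distribution data; how that threshold is chosen is left open here.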

These resources can be configured on the fly to provide the hardware and operating system of choice to the customer on a large scale. ... While the current target market for these resources in the commercial space is web development/hosting, this model has the lure of savings of ownership, operation, and maintenance costs, and thus sounds ... thus returning the application to the place it began. ...

doi:10.1109/ipdps.2009.5161234 dblp:conf/ipps/BrandtGMPRTW09 fatcat:phrckpu67zaknd4nhuadgwsc5a

Hence, it provides support to localize the reason of the anomaly. The proposed approach is model-based; it relies on the multivariate probability distribution associated with the observations. ... With the proposed procedure, many such hidden issues can be isolated and indicated to the network operator. ... It states that every multivariate joint distribution can be fully reconstructed based on its marginals and its copula. The density function f x 1 , ... ...

arXiv:1912.02166v1 fatcat:lgjvq7xwtvgthj55pk5qpkalma

For example, trends and outliers likewise are supposed to be rare occurrences. In this paper, we discuss the close relationship of these tasks. ... Many established outlier detection methods are designed to search for low-density objects in a static data set of vectors in Euclidean space. ... SigniTrend uses the most complex model, consisting of exponentially weighted moving average and standard deviation, and it thus can also capture variance. ...

doi:10.1109/icdmw.2015.79 dblp:conf/icdm/SchubertWZ15 fatcat:xntevut4pbhw5e4c3pcpenhcdy
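
The snippet above describes a model built from an exponentially weighted moving average and standard deviation. The sketch below shows one common incremental form of those two statistics and uses them to score each new value before updating. It is an illustration under assumptions (the smoothing factor `alpha`, the `warmup` length, and the function name are hypothetical) and is not the SigniTrend implementation.

```python
import math

def ewma_scores(values, alpha=0.1, warmup=5):
    """Score each value against an exponentially weighted mean and std.

    The score is computed from the statistics of earlier values only,
    then the statistics are updated with the new value.
    """
    mean, var = float(values[0]), 0.0
    scores = [0.0]
    for i, x in enumerate(values[1:], start=1):
        std = math.sqrt(var)
        # Skip scoring until a few values have been seen.
        score = (x - mean) / std if (i >= warmup and std > 0) else 0.0
        scores.append(score)
        # Incremental exponentially weighted mean and variance update.
        diff = x - mean
        incr = alpha * diff
        mean += incr
        var = (1.0 - alpha) * (var + diff * incr)
    return scores

scores = ewma_scores([10, 11, 10, 9, 10, 11, 10, 30])
```

Because the variance is tracked alongside the mean, the same absolute jump is scored lower in a noisy series than in a stable one.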

For eCommerce websites, there has been a limited understanding of how to measure performance even though it has been researched in many ways and in various contexts over the past decade. ... The model is useful as a tool for benchmarking the performance of the website as well as a foundation for operationalising performance. ... Acknowledgements The authors would like to thank the reviewers for their insightful reviews Auger (2005) Pujani and Xu (2005) ...

doi:10.17705/1pais.03101 fatcat:2wkj4mybkrhtxh4lskfcpa77cu

Szczepanski

This paper describes our deployment of data mining techniques during final test to predict system level test failures and customer returns for two recent mixed-signal system-on-chip products. ... Emphasis is put on practical considerations for simplifying test flow implementation while still meeting the twin goals of reduced test cost and improved product quality. ... Acknowledgements We thank Steven Wei of MediaTek for guiding and supporting our collaboration and ChingCheng Wang of MediaTek for collecting product test data and validating test selections. ...

doi:10.1109/test.2013.6651892 dblp:conf/itc/ChenHYS13 fatcat:dm2i6pt6grdp5ou25yry5n3iyi

in the sense that (a) it is not only the quality of the predictions that matters but also the structure of the learned model itself, and (b) the knowledge captured by the learned model must be evaluated ... how much data is needed for training? will a random sample of 100k customers from a database of 10 million be "good enough"? ...

dblp:conf/dmkd/Smyth01 fatcat:fqcynanxtbfbnmgnvgwmlhiona

Screening customer returns with multivariate test analysis

Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series [article]

Data Mining In EDA - Basic Principles, Promises, and Constraints

SynC: A Unified Framework for Generating Synthetic Population with Gaussian Copula [article]

Online AMR Domestic Load Profile Characteristic Change Monitor to Support Ancillary Demand Services

Are Outlier Detection Methods Resilient to Sampling? [article]

Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations (Extended Version) [article]

A pattern mining framework for inter-wafer abnormality analysis

Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance [article]

Resource monitoring and management with OVIS to enable HPC in cloud computing environments

Copula-based anomaly scoring and localization for large-scale, high-dimensional continuous data [article]

Outlier Detection and Trend Detection: Two Sides of the Same Coin

Measuring the Performance of eCommerce Websites – An Owner's Perspective

Predicting system-level test and in-field customer failures using data mining

Breaking out of the Black-Box: Research Challenges in Data Mining
