KCL Statistics webinar, Autumn 2022
If you are interested in attending the webinars, please contact the organiser at: email@example.com
13 October, 14:00 (UK time) Dr Tamara Broderick (Massachusetts Institute of Technology)
Title and Abstract will appear here closer to the event.
Dr Broderick is an Associate Professor at MIT. She works in the areas of machine learning and statistics. Her research focuses on understanding how we can reliably quantify uncertainty and robustness in modern, complex data analysis procedures. She is particularly interested in Bayesian inference and graphical models with an emphasis on scalable, nonparametric, and unsupervised learning.
27 October, 14:00 (UK time) Professor George Michailidis (University of Florida)
Title and Abstract will appear here closer to the event.
Professor Michailidis is Professor of Statistics and founding director of the University of Florida Informatics Institute, College of Liberal Arts and Sciences. His research focuses on developing methodology for high-dimensional data and addressing the corresponding algorithmic and inference issues. Specific research projects include joint estimation of multiple graphical models, vector autoregressive models, pathway enrichment analysis and fast monitoring techniques for high-dimensional streaming data. He has also developed extensive methodology for the analysis of time-evolving networks.
3 November, 14:00 (UK time) Professor Peter Hoff (Duke University)
Title: Core Shrinkage Covariance Estimation for Matrix-variate Data
A separable covariance model for a random matrix provides a parsimonious description of the covariances among the rows and among the columns of the matrix, and permits likelihood-based inference with a very small sample size. However, in many applications the assumption of exact separability is unlikely to be met, and data analysis with a separable model may overlook or misrepresent important dependence patterns in the data. In this article, we propose a compromise between separable and unstructured covariance estimation. We show how the set of covariance matrices may be uniquely parametrized in terms of the set of separable covariance matrices and a complementary set of "core" covariance matrices, where the core of a separable covariance matrix is the identity matrix. This parametrization defines a Kronecker-core decomposition of a covariance matrix. By shrinking the core of the sample covariance matrix with an empirical Bayes procedure, we obtain an estimator that can adapt to the degree of separability of the population covariance matrix.
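As a toy illustration of the separability assumption that the core-shrinkage approach relaxes (this is not Prof. Hoff's estimator; the matrices and sample sizes below are purely illustrative), a separable covariance for matrix-variate data is a Kronecker product of a row covariance and a column covariance:

```python
import numpy as np

# Illustrative sketch only: a separable (Kronecker) covariance for
# m x n random matrices, i.e. Cov(vec(Y)) = B kron A with row
# covariance A and column covariance B (column-major vec convention).
rng = np.random.default_rng(0)

m, n = 3, 4
A = np.array([[2.0, 0.5, 0.0],   # row covariance (m x m), made up
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
B = np.eye(n) + 0.3 * np.ones((n, n))   # column covariance (n x n), made up

Sigma = np.kron(B, A)            # separable covariance of vec(Y)

# Draw vectorised samples and check the empirical covariance approaches Sigma.
L = np.linalg.cholesky(Sigma)
vecs = (L @ rng.standard_normal((m * n, 20000))).T
S = np.cov(vecs, rowvar=False)
print(np.max(np.abs(S - Sigma)))   # small for large sample sizes
```

The talk's Kronecker-core decomposition parametrises a general covariance in terms of such a separable part plus a "core"; shrinking the core of the sample covariance towards the identity then interpolates between this separable model and the unstructured estimator.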
10 November, 14:00 (UK time) Professor Claire Gormley (University College Dublin)
Title and Abstract will appear here closer to the event.
Professor Gormley is based at the School of Mathematics and Statistics at University College Dublin. Her research focuses on the development of novel statistical methods, largely based on latent variable models, for the analysis of high dimensional data, often of mixed type. These methods solve applied problems across a range of disciplines, including epigenetics, metabolomics, genomics, social science, sports science and political science.
17 November, 14:00 (UK time) Professor Richard Davis (Columbia University)
Title and Abstract will appear here closer to the event.
Professor Davis’s research interests lie primarily in the areas of applied probability, time series, and stochastic processes. His dissertation work focused on extreme values of general stationary processes. While his research interests have gravitated towards problems in time series analysis (inference, estimation, prediction and general properties of time series models), extreme value theory still has a strong influence in his approach to solving problems.
KCL Statistics webinar, Spring 2022
If you are interested in attending the webinars, please contact the organiser at: firstname.lastname@example.org
31 March. 14:00 Dr Sandra Fortini (Bocconi University)
Title: Predictive constructions based on Measure-valued Pólya urn processes.
The central assumption in the Bayesian approach to inductive reasoning is that there exists a random parameter that rules the distribution of the observations. The model is completed by choosing a prior distribution for the parameter, and inference consists in computing the conditional distribution of the parameter, given the sample. A different modelling strategy uses the Ionescu-Tulcea theorem to define the law of the observation process from the sequence of predictive distributions. In this talk, we consider a class of predictive constructions based on measure-valued Pólya urn processes. These processes have been introduced in the probabilistic literature as an extension of k-colour urn models, but their implications for Bayesian statistics have yet to be explored.
24 March. 16:00 Professor Stephen Walker (University of Texas at Austin)
Title: Bayesian Learning with Martingales.
A nice result of Doob from the 1940s showed how Bayesian inference can be understood by predictive sampling. In this framework, which does not necessarily need to start with a prior, martingales become the key tool for ensuring convergence of limits of variables which can be treated as samples from a posterior distribution. The practical features of the martingales are that very complex models requiring MCMC, for example, can be sampled directly and can moreover make use of parallel sampling. Some Bayesian nonparametric problems will be illustrated.
Joint work with Chris Holmes and Edwin Fong (University of Oxford)
The paper has been accepted as a discussion paper by the Journal of the Royal Statistical Society, Series B.
You can find the final version of the paper here: https://arxiv.org/abs/2103.15671
3 March. 14:00 Dr Shahin Tavakoli (University of Geneva)
Title: Factor Models for High Dimensional Functional Time Series.
We setup the theoretical foundations for a high-dimensional functional factor model approach in the analysis of large cross-sections (panels) of functional time series (FTS). We first establish a representation result stating that, under mild assumptions on the covariance operator of the cross-section, we can represent each FTS as the sum of a common component driven by scalar factors loaded via functional loadings, and a mildly cross-correlated idiosyncratic component. Our model and theory are developed in a general Hilbert space setting that allows for mixed panels of functional and scalar time series. We then turn to the identification of the number of factors, and the estimation of the factors, their loadings, and the common components. We provide a family of information criteria for identifying the number of factors, and prove their consistency. We provide average error bounds for the estimators of the factors, loadings, and common component; our results encompass the scalar case, for which they reproduce and extend, under weaker conditions, well-established similar results. Under slightly stronger assumptions, we also provide uniform bounds for the estimators of factors, loadings, and common component, thus extending existing scalar results. Our consistency results in the asymptotic regime where the number N of series and the number T of time observations diverge thus extend to the functional context the “blessing of dimensionality” that explains the success of factor models in the analysis of high-dimensional (scalar) time series. We provide numerical illustrations that corroborate the convergence rates predicted by the theory, and provide finer understanding of the interplay between N and T for estimation purposes. We conclude with an application to forecasting mortality curves, where we demonstrate that our approach outperforms existing methods.
Joint work with Gilles Nisol (ULB) and Marc Hallin (ULB)
24 Feb. 14:00 Professor Nial Friel (University College Dublin)
Title: Assessing competitive balance in the English Premier League for over forty seasons using a stochastic block model.
Abstract: Competitive balance is a desirable feature in any professional sports league and encapsulates the notion that there is unpredictability in the outcome of games as opposed to an imbalanced league in which the outcome of some games are more predictable than others, for example, when an apparent strong team plays against a weak team. In this paper, we develop a model-based clustering approach to provide an assessment of the balance between teams in a league. We propose a novel Bayesian model to represent the results of a football season as a dense network with nodes identified by teams and categorical edges representing the outcome of each game. The resulting stochastic block model facilitates the probabilistic clustering of teams to assess whether there are competitive imbalances in a league. A key question then is to assess the uncertainty around the number of clusters or blocks and consequently estimation of the partition or allocation of teams to blocks. To do this, we develop an MCMC algorithm that allows the joint estimation of the number of blocks and the allocation of teams to blocks. We apply our model to each season in the English Premier League from 1978/79 to 2019/20. A key finding of this analysis is evidence which suggests a structural change from a reasonably balanced league to a two-tier league which occurred around the early 2000s.
(Joint work with F. Basini, V. Tsouli, and I. Ntzoufras)
27 Jan. 14:00 Conchi Ausin (Universidad Carlos III de Madrid, Spain)
Title: Variational inference for high dimensional structured factor copulas
Abstract: Factor copula models have been recently proposed for describing the joint distribution of a large number of variables in terms of a few common latent factors. A Bayesian procedure is employed in order to make fast inferences for multi-factor and structured factor copulas. To deal with the high dimensional structure, a Variational Inference (VI) algorithm is applied to estimate different specifications of factor copula models. Compared to the Markov Chain Monte Carlo (MCMC) approach, the variational approximation is much faster and could handle a sizeable problem in limited time. Another issue of factor copula models is that the bivariate copula functions connecting the variables are unknown in high dimensions. An automatic procedure is derived to recover the hidden dependence structure. By taking advantage of the posterior modes of the latent variables, the bivariate copula functions are selected by minimizing the Bayesian Information Criterion (BIC). Simulation studies in different contexts show that the procedure of bivariate copula selection could be very accurate in comparison to the true generated copula model. The proposed procedure is illustrated with two high dimensional real data sets.
(Joint work with Hoang Nguyen and Pedro Galeano)
KCL Statistics webinar, Autumn 2021
7 Oct. 14:00 Alessia Pini (Università Cattolica del Sacro Cuore, Italy)
Title: Local inference for functional data
Abstract: A topic which is becoming more and more popular in Functional Data Analysis is local inference, i.e., the continuous statistical testing of a null hypothesis along a domain of interest. The principal issue in this topic is the infinite number of tested hypotheses, which can be seen as an extreme case of the multiple comparisons problem.
A number of quantities have been introduced in the literature of multivariate analysis in relation to the multiple comparisons problem. Arguably the most popular ones are the Family-Wise Error Rate (FWER) and the False Discovery Rate (FDR). FWER measures the probability of committing at least one type I error along the domain, while FDR measures the expected proportion of false discoveries (rejected null hypotheses) among all discoveries.
This talk presents the extension of those two concepts to functional data, and proposes different methods to adjust pointwise p-values to control the defined quantities.
The proposed methods are applied to human movement data and to worldwide temperature data.
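The scalar versions of the two error rates can be illustrated with the classical Bonferroni (FWER) and Benjamini-Hochberg (FDR) adjustments; this is only a sketch of the familiar multivariate baseline, not the functional extension proposed in the talk, and the simulated p-value distributions are made up:

```python
import numpy as np

# Simulated p-values: 800 true nulls (uniform) and 200 alternatives
# (concentrated near zero). Purely illustrative numbers.
rng = np.random.default_rng(1)
m = 1000
p = np.concatenate([rng.uniform(size=800), rng.beta(0.1, 5.0, size=200)])
alpha = 0.05

# Bonferroni: reject p_i <= alpha / m; controls P(any type I error) = FWER.
bonf_reject = p <= alpha / m

# Benjamini-Hochberg: reject the k smallest p-values, where k is the largest
# index with p_(k) <= alpha * k / m; controls the FDR for independent tests.
order = np.argsort(p)
below = p[order] <= alpha * np.arange(1, m + 1) / m
k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
bh_reject = np.zeros(m, dtype=bool)
bh_reject[order[:k]] = True

print(bonf_reject.sum(), bh_reject.sum())  # BH rejects at least as many
```

BH always rejects at least as many hypotheses as Bonferroni, at the cost of controlling the weaker FDR criterion; the talk's contribution is extending both quantities to a continuum of pointwise tests over a functional domain.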
21 Oct. 15:00 Christine Anderson-Cook (Los Alamos National Laboratory, US)
Title: Sequential Design of Experiments and Some Innovative Space-Filling Designs
Abstract: In many data collection scenarios, we have choices about whether to run a single large experiment or a sequence of small experiments. This talk describes some of the advantages of collecting data in increments, and using the results from previous stages to drive design choices for later ones. This approach can avoid wasting valuable resources, maximize what can be learned and allow for multiple objectives to be addressed. In addition, several new types of space filling designs will be presented: (1) Non-uniform Space Filling (NUSF) designs allow for some regions of the input space to be emphasized more than others, and (2) Input-Response Space Filling (IRSF) designs create a Pareto front of choices that vary in how much they emphasize good space filling properties for the input space versus the response space.
11 Nov. 14:00 Alan R. Vazquez (University of California, US)
Title: Effective algorithms for constructing two-level QB-optimal designs for screening experiments
Abstract: Optimal two-level screening designs are widely applied in the manufacturing industry to identify factors that explain most of the product variability. These designs feature each factor at two settings and are traditionally constructed using standard algorithms, which rely on a pre-specified linear model. Since the assumed model may depart from the truth, two-level QB-optimal designs have been developed to provide efficient estimates for parameters in a large set of potential models as well. The optimal designs also have an overarching goal that models that are more likely to be the best for explaining the data are estimated more efficiently than the rest. Despite these attractive features, there are no good algorithms to construct these designs. In this talk, we therefore propose two algorithms. Our first algorithm, which is rooted in mixed integer programming, guarantees convergence to the two-level QB-optimal designs. Our second algorithm, which is based on metaheuristics, employs a novel formula to assess these designs and is computationally efficient. Using numerical experiments, we demonstrate that our mixed integer programming algorithm is attractive for finding small optimal designs, and our heuristic algorithm is an effective approach to construct both small and large designs.
This talk is based on joint work with Weng Kee Wong (UCLA) and Peter Goos (KU Leuven).
25 Nov. 14:00 Jakob Macke (University of Tübingen, Germany)
Title: Simulation-based inference for neuroscience (and beyond)
Abstract: Neuroscience research makes extensive use of mechanistic models of neural dynamics. These models are often implemented through numerical simulators, requiring the use of simulation-based approaches to statistical inference. I will talk about our recent work on developing simulation-based inference methods using flexible density estimators parameterised with neural networks, our efforts on benchmarking these approaches, and applications to modelling problems in neuroscience and beyond.
Reference: Dax, M., Green, S. R., Gair, J., Macke, J. H., Buonanno, A., & Schölkopf, B. (2021). Real-time gravitational-wave science with neural posterior estimation. arXiv preprint arXiv:2106.12594.
2 Dec. 14:00 Kirstin Strokorb (Cardiff University, UK)
Title: Max-stable processes - A guided tour with an emphasis on simulation
Abstract: Being the max-analogue of stable stochastic processes, max-stable processes form one of the fundamental classes of stochastic processes. With the arrival of sufficient computational capabilities, they have become a benchmark in the analysis of spatial extreme events. In this talk I will give an overview over key results and applications and then draw particular attention to their simulation algorithms that are critical for spatial risk assessment. Some new theoretical results allow us to put existing procedures for this task into perspective of one another and deduce some practical dos and don'ts in this context.
Joint work with: Marco Oesting (Stuttgart)
16 Dec. 14:00 Giacomo Zanella (Bocconi University, Italy)
Title: Robust leave-one-out cross-validation for high-dimensional Bayesian models
Abstract: Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive due to the need to fit the model multiple times. In the Bayesian context, importance sampling provides a possible solution but classical approaches can easily produce estimators whose variance is infinite, making them potentially unreliable. Here we propose and analyze a novel mixture estimator to compute Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite variance of the resulting estimators. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, allowing to perform Bayesian LOO-CV for a broader range of models.
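The classical importance-sampling baseline whose possibly infinite variance motivates the talk can be sketched on a conjugate toy model. The mixture estimator proposed in the talk is not implemented here; the normal-normal model, prior scale, and sample sizes are all illustrative:

```python
import numpy as np

# Classical importance-sampling LOO-CV for a conjugate normal model:
# y_i ~ N(theta, 1), theta ~ N(0, 10^2). Illustrative settings only.
rng = np.random.default_rng(2)
y = rng.normal(1.0, 1.0, size=50)
n = y.size

# Exact conjugate posterior for theta, sampled directly (no MCMC needed).
post_prec = 1 / 10.0**2 + n
post_mean = y.sum() / post_prec
theta = rng.normal(post_mean, np.sqrt(1 / post_prec), size=4000)

def log_lik(yi, th):
    return -0.5 * np.log(2 * np.pi) - 0.5 * (yi - th) ** 2

# Classical IS estimate of log p(y_i | y_{-i}) uses weights w_s = 1/p(y_i|theta_s):
# log p(y_i | y_{-i}) ~ -log mean_s exp(-log_lik(y_i, theta_s)).
loo = np.empty(n)
for i in range(n):
    ll = log_lik(y[i], theta)
    loo[i] = -(np.logaddexp.reduce(-ll) - np.log(theta.size))

print(loo.sum())   # estimated expected log predictive density (elpd_loo)
```

The weights 1/p(y_i | theta_s) are exactly the heavy-tailed quantities that can have infinite variance under the posterior; the mixture estimator of the talk replaces this plain average with one whose variance is guaranteed finite.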
Title: Challenges and opportunities in the analysis of clinical data.
Abstract: The increasing availability of clinical measures (e.g., electronic medical records) leads to the collection of different types of information. This information includes multiple longitudinal measurements and, sometimes, also survival outcomes. The motivation comes from several clinical applications. In particular, patients after a heart valve replacement have a higher risk of dying or requiring a reoperation. These patients are followed echocardiographically, and several biomarkers are collected. Another example comes from patients after stroke, where measurements to assess recovery are taken over time.
Each outcome of interest is mainly analyzed separately, although it is biologically relevant to study them together. Previous work has focused on investigating the association between longitudinal and survival outcomes; however, less work has been done to examine the association between multiple longitudinal outcomes. In that case, it is common to assume a multivariate normal distribution for the corresponding random effects. This approach, nevertheless, will not measure the strength of association between the outcomes. Including longitudinal outcomes, as time-dependent covariates, in the model of interest would allow us to investigate the strength of the association between different outcomes.
Several challenges arise in both the analysis of multiple longitudinal data and longitudinal-survival data. In particular, different characteristics of the patients' longitudinal profiles could influence the outcome(s) of interest. Using extensions of multivariate mixed-effects models and joint models of longitudinal and survival outcomes, we investigate how different features (underlying value, slope, area under the curve) of the longitudinal predictors are associated with the primary outcome(s).
Using an extensive simulation study, we investigate the impact of misspecifying the association between the outcomes. The results show important bias when not using the appropriate characteristic of the longitudinal profile. In this new era of rich medical data sets, it is often challenging to decide how to analyze all the available data appropriately.
Title: Ordering in metric spaces
Abstract: Statistical data depth orders the elements of a space with respect to a distribution, and, in particular, orders elements in a dataset. The first part of the talk will be introductory, starting in spaces of dimension one and going to multivariate and then functional spaces. Then, we will present a statistical depth function for metric spaces that is the first instance of depth that satisfies the required properties in the definition of functional depth.
25 Feb. 14:00 Eric Schoen (KU Leuven, Belgium)
Title: Order-of-addition experiments to elucidate the sequence effects of treatments
Abstract: The sequence in which a set of treatments is applied may have an effect on the properties of the experimental units after applying all the treatments. For example, the order of adding six components of an automobile coating paint can affect the properties of the coating. In this example, a treatment corresponds with adding a particular component and the experimental unit is the paint. In the absence of theoretical guidance, the optimal sequence should be determined experimentally. However, even for moderate numbers of treatments, the total number of possible sequences can be too large to include all of them in an experiment. Instead, a fraction of the total number is tried out and the optimal sequence is inferred from a statistical model of the results. In fact, the automobile coating paint was investigated using just 24 of the 720 possible sequences.
The statistical model involves so-called pairwise order (PWO) factors that take the value +1 if treatment i is carried out before treatment j and -1 if i is carried out after treatment j. There is one PWO factor for each pair of treatments. An experiment to study the sequence effect of m treatments therefore involves m(m-1)/2 such factors.
In the talk, I will introduce the PWO factor-based models step by step. Then, I will turn to optimal statistical designs for estimating these models. We collected complete sets of optimal 12-run and 24-run designs under the PWO factor model for up to 7 components. These designs can be subjected to further evaluation criteria featuring estimation efficiency for models that include some two-factor interactions of the PWO factors.
Joint work with R. W. Mee, University of Tennessee, Knoxville TN, USA
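The PWO coding described above is simple to write down. The following sketch builds the full m!-by-m(m-1)/2 PWO model matrix for m = 4 treatments; it only illustrates the coding, not the optimal-design construction from the talk:

```python
import numpy as np
from itertools import combinations, permutations

# PWO coding: for each pair (i, j) with i < j, the factor is +1 if
# treatment i is applied before treatment j in the sequence, else -1.
def pwo_row(sequence):
    pos = {t: k for k, t in enumerate(sequence)}
    return [1 if pos[i] < pos[j] else -1
            for i, j in combinations(sorted(sequence), 2)]

m = 4
sequences = list(permutations(range(m)))     # all m! = 24 orderings
X = np.array([pwo_row(s) for s in sequences])
print(X.shape)   # (24, 6): m(m-1)/2 = 6 PWO factors, as stated above
```

Over the full set of sequences each PWO column is balanced (equal numbers of +1 and -1); a designed experiment selects a small fraction of the rows of such a matrix while keeping the PWO model estimable.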
11 Mar. 14:00 Frederic Ferraty (University of Toulouse - Jean Jaures, France)
Title: Scalar-on-function local linear regression and beyond
Abstract: Regressing a scalar response on a random function (i.e. random variable taking values in function space) is nowadays a common situation. In the nonparametric setting, this paper paves the way for making the local linear regression based on a projection approach a prominent method for solving this regression problem. Our asymptotic results demonstrate that the functional local linear regression outperforms its functional local constant counterpart. Beyond the estimation of the regression operator itself, the local linear regression is also a useful tool for predicting the functional derivative of the regression operator, a promising mathematical object on its own. The local linear estimator of the functional derivative is shown to be consistent. On simulated datasets we illustrate good finite sample properties of both proposed methods. On a real data example of a single-functional index model we indicate how the functional derivative of the regression operator provides an original and fast, widely applicable estimating method.
(Joint work with Stanislav Nagy)
18 Mar. 14:00 Valeria Vitelli (University of Oslo, Norway)
Title: The Bayesian Mallows model: from preference learning to rank-based genomic data integration, with some recent advances
Abstract: The use of ranks in genomics is naturally linked to the underlying biological question, since one is often interested in over-expressed genes in a given pathology. Moreover, ranks are insensitive to heterogeneity in the measurement scales, and more robust to outliers and measurement error. We propose to use a Bayesian Mallows model for ranks, able both to produce estimates of the consensus ranking of the genes shared among samples, and to fill in missing data. Interestingly, the model has already been fruitfully applied in other contexts, such as recommender systems, where it has proved to be useful for learning individualised preferences of the users, useful for providing personalized suggestions. Both when used in the context of genomics studies, and in user-oriented applications, posterior distributions of the unknowns are particularly relevant, since these can provide an evaluation of the uncertainty associated to the estimates, and thus of the reliability of the results.
I will present a statistical model, the Bayesian Mallows ranking model, which works well in these situations, and which is capable of flexibly handling quite different kinds of data. The Bayesian paradigm allows for a fully probabilistic analysis, and it easily handles missing data and cluster estimation via augmentation procedures. I will briefly review some relevant case studies to show the method's potential in the variety of situations in which we applied it, from genomic data integration to recommender systems. I will then conclude with a brief teaser on the most recent advancements and extensions.
6 May 14:00 Nathaniel Stevens (University of Waterloo, Canada)
Title: Design and Analysis of Confirmation Experiments
Abstract: The statistical literature and practitioners have long advocated for the use of confirmation experiments as the final stage of a sequence of designed experiments to verify that the optimal operating conditions identified as part of a response surface methodology strategy are attainable and able to achieve the value of the response desired. However, until recently there has been a gap between this recommendation and details about how to perform an analysis to quantitatively assess if the confirmation runs are adequate. Similarly, there has been little in the way of specific recommendations for the number and nature of the confirmation runs that should be performed. In this talk, we propose analysis methods to assess agreement between the mean response from previous experiments and the confirmation experiment, as well as suggest a strategy for the design of confirmation experiments that more fully explores the region around the optimum.
KCL Statistics webinar, Autumn 2020
If you are interested in attending the webinars, please contact the organiser at: email@example.com
15 Oct. 14:00 Eduardo García Portugués (UC3M)
Title: Uniformity tests on the hypersphere via projections
Abstract: Testing uniformity is perhaps the most fundamental problem when dealing with hyperspherical data, a datatype in which directions encode all the relevant information. In particular, testing spherical uniformity has three interesting astronomical applications regarding the study of sunspots, comets, and craters. In this talk, we introduce a projection-based class of uniformity tests on the hypersphere based on the empirical cumulative distribution function. This new class allows the derivation of new tests that neatly extend the circular-only tests by Watson, Ajne, and Rothman to the hypersphere, while also introducing the first instance of an Anderson–Darling-like test for hyperspherical data. Tractable expressions and asymptotics for the test statistics are provided, and the connection of the new class with the Sobolev class is elucidated. A simulation study evaluates the theoretical findings and evidences the competitiveness of the new tests. Applications in astronomy are shown. The talk is based on joint work (arXiv:2008.09897) with Paula Navarro-Esteban and Juan A. Cuesta-Albertos.
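For intuition, a classical member of the Sobolev class mentioned above is the Rayleigh test of uniformity; the sketch below computes its statistic on simulated hyperspherical data (this is not one of the talk's projection-based tests, and the sample size is illustrative):

```python
import numpy as np

# Rayleigh test of uniformity on the (p-1)-sphere: under uniformity,
# the statistic p * n * ||sample mean||^2 is asymptotically chi^2_p.
rng = np.random.default_rng(3)

n, p = 200, 3
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # uniform points on the sphere

stat = p * n * np.sum(X.mean(axis=0) ** 2)
print(stat)   # moderate (chi^2_3-like) values are consistent with uniformity
```

The Rayleigh test only detects departures in the mean direction; the tests introduced in the talk, built from empirical distribution functions of projections, are sensitive to a much wider class of alternatives.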
29 Oct. 14:00 Peter Goos (KU Leuven)
Title: Orthogonal Minimally Aliased Response Surface Designs
Abstract: Response surface designs are a core component of the response surface methodology, which is widely used in the context of product and process optimization. In this presentation, we introduce a new class of 3-level response surface designs, which can be viewed as matrices with entries equal to −1, 0 and +1. Because the new designs are orthogonal for the main effects and exhibit no aliasing between the main effects and the second-order effects (two-factor interactions and quadratic effects), we call them orthogonal minimally aliased response surface designs or OMARS designs. We constructed an extensive catalog of OMARS designs for commonly used numbers of factors using integer programming. Also, we characterized each design in the catalog extensively in terms of estimation and prediction efficiency, identified interesting designs and investigated trade-offs between the different design evaluation criteria. Finally, we developed a multi-attribute decision algorithm to select designs from the catalog and built OMARS designs with two-level categorical factors as well. In the presentation, we compare the new OMARS designs to benchmark definitive screening designs constructed using commercial software.
12 Nov. 14:00 Jessica Barrett (University of Cambridge)
Title: Modelling longitudinal heteroscedasticity: Within-individual blood pressure variability and the risk of cardiovascular disease
Abstract: I will consider the problem of how to estimate the association between blood pressure variability and cardiovascular disease. In the clinical literature this is typically done by calculating a variability measure, e.g. the standard deviation, from a set of blood pressure measurements per individual, and including this as an explanatory variable in a regression model for the time to the first cardiovascular event. But this leads to regression dilution bias in the estimated association parameter because the variability measure is subject to measurement error. I will discuss instead the use of statistical models for within-individual variability which allow the residual variance to depend on covariates and/or random effects, e.g. mixed-effects location scale models (Hedeker et al., "An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data", Biometrics (2008), 64(2):627-34).
26 Nov. 14:00 Katie Severn (University of Nottingham)
Title: Manifold valued data analysis of samples of networks, with applications in corpus linguistics
Abstract: Networks can be used to represent many systems such as text documents and brain activity, and it is of interest to develop statistical techniques to compare networks. We develop a general framework for extrinsic statistical analysis of samples of networks, motivated by networks representing text documents in corpus linguistics. We identify networks with their graph Laplacian matrices, for which we define metrics, embeddings, tangent spaces, and a projection from Euclidean space to the space of graph Laplacians. This framework provides a way of computing means, performing principal component analysis and regression, and performing hypothesis tests, such as for testing for equality of means between two samples of networks. We apply the methodology to the set of novels by Jane Austen and Charles Dickens.
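The identification of networks with their graph Laplacians can be sketched in a few lines; this toy example (not the full extrinsic framework from the talk, and with made-up edge weights) averages two small weighted networks in the Euclidean embedding:

```python
import numpy as np

# Identify a weighted network with its graph Laplacian L = D - W,
# then average networks entrywise in this embedding.
def laplacian(W):
    return np.diag(W.sum(axis=1)) - W

# Two toy weighted networks on the same 3 nodes (weights are illustrative,
# e.g. word co-occurrence counts in two text documents).
W1 = np.array([[0, 2, 1], [2, 0, 0], [1, 0, 0]], dtype=float)
W2 = np.array([[0, 1, 0], [1, 0, 3], [0, 3, 0]], dtype=float)

L_mean = (laplacian(W1) + laplacian(W2)) / 2
print(L_mean.sum(axis=1))   # rows of a graph Laplacian sum to zero
```

Plain entrywise averages of Laplacians need not themselves be graph Laplacians in general metrics, which is why the framework in the talk also defines tangent spaces and a projection from Euclidean space back onto the space of graph Laplacians.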
3 Dec. 14:00 Timothy I. Cannings (University of Edinburgh)
Title: Adaptive Transfer Learning
Abstract: In transfer learning, we wish to make inference about a target population when we have access to data both from the distribution itself, and from a different but related source distribution. We introduce a flexible framework for transfer learning in the context of binary classification, allowing for covariate-dependent relationships between the source and target distributions that are not required to preserve the Bayes decision boundary. Our main contributions are to derive the minimax optimal rates of convergence (up to polylogarithmic factors) in this problem, and show that the optimal rate can be achieved by an algorithm that adapts to key aspects of the unknown transfer relationship, as well as the smoothness and tail parameters of our distributional classes. This optimal rate turns out to have several regimes, depending on the interplay between the relative sample sizes and the strength of the transfer relationship, and our algorithm achieves optimality by careful, decision tree-based calibration of local nearest-neighbour procedures.
This is ongoing work with Henry Reeve (Bristol) and Richard Samworth (Cambridge).
10 Dec. 14:00 Juhyun Park (Lancaster University/ENSIIE)
Title: Sparse functional linear discriminant analysis
Abstract: Functional linear discriminant analysis offers a simple yet efficient method for classification, with the possibility of achieving a perfect classification. Several methods are proposed in the literature that mostly address the dimensionality of the problem. On the other hand, there is a growing interest in interpretability in the analysis, which favours a simple and sparse solution. In this work, we propose a new approach that incorporates a type of sparsity that identifies non-zero sub-domains in the functional setting, offering a solution that is easier to interpret without compromising performance. With the need to embed additional constraints in the solution, we reformulate the functional linear discriminant analysis as a regularization problem with an appropriate penalty. Inspired by the success of $\ell_1$-type regularization at inducing zero coefficients for scalar variables, we develop a new regularization method for functional linear discriminant analysis that incorporates an $L^1$-type penalty, $\int |f|$, to induce zero regions. We demonstrate that our formulation has a well-defined solution that contains zero regions, achieving a functional sparsity in the sense of domain selection. In addition, the misclassification probability of the regularized solution is shown to converge to the Bayes error if the data are Gaussian. Our method does not presume that the underlying function has zero regions in the domain, but produces a sparse estimator that consistently estimates the true function whether or not the latter is sparse. Numerical comparisons with existing methods demonstrate this property in finite samples with both simulated and real data examples.