List of Publications
The conference and journal versions of the same paper are aggregated into a single entry. The local links in each entry point to the most up-to-date version of a paper. You can hide the abstracts if you like.
Published
-
A Distributed Algorithm for Large-Scale Generalized Matching
VLDB 2013
Generalized matching problems arise in a number of applications, including computational advertising, recommender systems, and trade markets. Consider, for example, the problem of recommending multimedia items (e.g., DVDs) to users such that (1) users are recom- mended items that they are likely to be interested in, (2) every user gets neither too few nor too many recommendations, and (3) only items available in stock are recommended to users. State-of-the-art matching algorithms fail at coping with large real-world instances, which may involve millions of users and items. We propose the first distributed algorithm for computing near-optimal solutions to large-scale generalized matching problems like the one above.
-
Instance-Sensitive Robustness Guarantees for Sequencing with Unknown Packing and Covering Constraints
ITCS 2013
Sequencing problems with an unknown covering or packing constraint appear in various applications, e.g., in real-time computing environments with uncertain run-time availability. A sequence is called \alpha-robust when, for any possible constraint, the maximal or minimal prefix of the sequence that satisfies the constraint is at most a factor \alpha from an optimal packing or covering.
In this work we address the fact that many problem instances may allow for a much better robustness guarantee than the pathological worst case instances. We aim for more meaningful, instance-sensitive performance guarantees. We present an algorithm that constructs for each instance a solution with a robustness factor arbitrarily close to optimal.
-
Enabling mobile distributed social networking on smartphones
MSWiM 2012
Distributed social networking services show promise to solve data ownership and privacy problems associated with centralized approaches. Smartphones could be used for hosting and sharing users data in a distributed manner, if the associated high communication costs and battery usage issues could be mitigated. We propose MobiTribe, a novel network-aware distributed content sharing system for distributed social networking, which addresses these challenges by taking advantage of network availability patterns to provide continuous availability of content over low-cost network connections. MobiTribe groups devices into tribes among intended content consumers, which replicate content and serve it using low-cost network connections by exploiting time elasticity of user generated content dissemination. We develop a grouping algorithm based on a combination of a bipartite b-matching and a greedy heuristic algorithm which groups mobile devices into tribes in polynomial time.
-
Optimization Problems in Dotted Interval Graphs
WG 2012
The class of D-dotted interval (D-DI) graphs is the class of intersection graphs of arithmetic progressions with jump at most D. The class generalizes intervals (D-1). We consider various classical graph-theoretic optimization problem when restricted to D-DI graphs. Our algorithms are based on classical results in algorithmic graph theory and new structural properties of D-DI graphs that may be of independent interest.
-
On Tree-Constrained Matchings and Generalizations
ICALP 2011
and
Algorithmica
We study a generalization of maximum weight bipartite matching, where we are given in addition posets over each side of the bipartition and we add the additional requirement that the matched vertices on each side form an independent set in the corresponding poset. We give approximation algorithms and hardness for the problem.
-
Inferring Physical Protein Contacts from Large-Scale Purification Data of Protein Complexes
Molecular and Cellular Proteomics
Recently published large-scale data sets of tandem-affinity protein purifications have allowed unprecedented insights into the organization of cellular protein complexes. Several computational methods have been developed to detect co-complexing proteins in these data, aiding in the identification of biologically relevant protein complexes. While it is now possible to identify key members of protein complexes in a high-throughput fashion, much less is known about the protein-connecting network of physical contacts within these purifications that determines many aspects of complex formation, dynamics, and disease etiology.
We introduce a novel approach for predicting physical contacts in protein purifications and compare its performance to four established methods for scoring co-complexed protein pairs based on several high-confidence experimental reference sets.
-
Bonsai: Growing Interesting Small Trees
ICDM 2010
In this paper, we address the problem of finding such cardinality constrained subgraphs from large node-weighted graphs that maximize the sum of weights of selected nodes. We provide an efficient constant-factor approximation algorithm for this strongly NP-hard problem. A key feature of our approach is that we can simultaneously compute heavy subgraphs for a range of cardinality constraints, thus making it particularly suited for browsing operations during visual analytics. We show how our techniques can be applied in a variety of application settings such as discovering functional modules (most deviant subnetworks) in protein-protein interaction graphs, identifying active cores of topical subgraphs of the Wikipedia graph, detection of communities in social networks, etc.
-
When LP is the Cure for Your Matching Woes: Improved Bounds
for Stochastic Matchings
ESA 2010 (Best paper award) and Algorithmica (special issue)
In this paper we study stochastic matching problems that are motivated by applications in online dating and kidney exchange programs. Our main result is an algorithm that finds a matching-probing strategy having a small constant approximation ratio. An interesting aspect of our approach is that we compare the cost our solution to the optimal edge-probing strategy. Thus, we indirectly show that the best matching-probing strategy is only a constant factor away from the best edge-probing strategy.
(This is the merger of two papers: arxiv:1003.0167 and arxiv:1002.3763.)
-
The checkpoint problem
APPROX
2010
and
Theoretical
Computer Science
In this paper, we consider the checkpoint problem in which given an undirected graph G, a set of source-destinations pairs and a set of fixed paths P between them, the goal is to find a set of checkpoint edges E' which disconnect each pair and minimize the average or minimize the maximum intersection with each path in P. This problem has several natural applications, e.g., in urban transportation and network security, and in a sense combines the multicut problem and the minimum membership set cover problem. We give approximation algorithms and hardness of approximation for different variants of the problem.
-
Universal sequencing on a single machine
IPCO
2010 and SIAM Journal on Computing
We consider scheduling on an unreliable machine that may experience unexpected changes in processing speed or even full breakdowns. We aim for a universal solution that performs well without adaptation for any possible machine behavior. For the objective of minimizing the total weighted completion time, we design a polynomial time deterministic algorithm that finds a universal scheduling sequence with a solution value within 4 times the value of an optimal clairvoyant algorithm that knows the disruptions in advance. A randomized version of this algorithm attains a ratio of e. We we that both results are best possible among all universal solutions. We also consider the variant where jobs have release dates.
-
Assigning Papers to Referees
Algorithmica (special issue)
We study the problem of assigning papers to referees. We propose to optimize a number of criteria that aim at achieving fairness among referees/papers. Some of these variants can be solved optimally in polynomial time, while others are NP-hard, in which case we design approximation algorithms. Experimental results strongly suggest that the assignments computed by our algorithms are considerably better than those computed by popular conference management software.
-
Improved approximation guarantees for weighted matching in the semi-streaming model
STACS 2010
and
SIAM Journal on Discrete Mathematics
We study the maximum weight matching problem in the semi-streaming model, and improve on the currently best one-pass algorithm due to Zelke (STACS'2008) by devising a deterministic approach whose performance guarantee is 4.91. In addition, we study preemptive online algorithms, a sub-class of one-pass algorithms where we are only allowed to maintain a feasible matching in memory at any point in time. All known results prior to Zelke's belong to this sub-class. We provide a lower bound of 4.967 on the competitive ratio of any such deterministic algorithm, and hence show that future improvements will have to store in memory a set of edges which is not necessarily a feasible matching.
-
A polynomial delay algorithm for enumerating approximate solutions to the interval coloring problem
ALENEX 2010
We study the interval constrained coloring problem, a combinatorial problem arising in the interpretation of data on protein structure emanating from experiments based on hydrogen/deuterium exchange and mass spectrometry. The problem captures the challenging task of increasing the spatial resolution of experimental data in order to get a better picture of the protein structure. Since solutions proposed by any algorithmic framework have to ultimately be verified by biochemists, it is important to provide not just a single solution, but a valuable set of candidate solutions. Our contribution is a polynomial delay, polynomial space algorithm for enumerating all exact solutions plus further approxi- mate solutions, whose components are guaranteed to be within an absolute error of one of the optimum. Our experiments indicate that these approximate solutions are reasonably close to the optimal ones, in terms of the cumulative error.
-
Parametric Packing of Selfish Items and the Subset Sum Algorithm
WINE 2009
The subset sum algorithm is a natural heuristic for the classical Bin Packing problem: In each iteration, the algorithm finds among the unpacked items, a maximum size set of items that fits into a new bin. More than 35 years after its first mention in the literature, establishing the worst-case performance of this heuristic remains, surprisingly, an open problem.
We establish the exact approximation ratio of the subset sum algorithm and generalize this to the parametric case. As previous work showed, this implies tight bounds for the Strong Price of Anarchy of the bin packing game. Finally, we study the pure Price of Anarchy of the parametric Bin Packing game for which we show nearly tight upper and lower bounds.
-
Max-coloring paths: Tight bounds and extensions
ISAAC 2009 and Journal of Combinatorial Optimization (special issue)
The max-coloring problem takes as input a graph with a weight function. Given a vertex coloring of the graph, the weight of a color class is the maximum weight any node in the class. The problem is to compute a legal coloring of the graph such that sum of the weights of the color classes is minimized. Max-coloring general graphs is as hard as the classical vertex coloring problem, a special case where vertices have unit weight. Here consider the problem of max-coloring paths and its generalization, max-coloring a broad class of trees. Tight upper and lower bounds on the time complexity of this problem.
-
Popular mixed matchings
ICALP 2009 and Theoretical Computer Science (special issue)
Consider the problem of matching applicants to jobs under one-sided preferences. A matching M is said to be more popular than T if the applicants that prefer M to T outnumber those that prefer T to M. A matching is said to be popular if there is no matching more popular than it.
The main drawback of the popular matching concept is that it is not guaranteed to exist. In this work we address this issue by considering mixed matchings, that is, probability distribution over matchings. We show that popular mixed matchings always exist and design polynomial time algorithms for finding one.
-
Improved Approximation Algorithms for 1.5D Terrain Guarding
STACS 2009 and Algorithmica
We present a 4-approximation algorithm for the problem of placing the fewest guards on a 1.5D terrain so that every point of the terrain is seen by at least one guard.
Our method is based on rounding the linear programming relaxation of the corresponding covering problem. Besides the simplicity of the analysis, which mainly relies on decomposing the constraint matrix of the LP into totally balanced matrices. Unlike previous work, our algorithm generalizes to the weighted and partial versions of the basic problem.
-
An Optimal Incremental Algorithm for Minimizing Lateness with Rejection
ESA 2008
We re-examine the problem of scheduling jobs on a single machine to minimize maximum lateness with rejection. While this problem can be easily solved optimally in polynomial time, we show the following surprising result: there is a fixed permutation of the jobs, such that if we reject the first i jobs from this permutation, we derive an optimal solution for the problem in which we are allowed to reject i jobs. Moreover, we develop an optimal O(n log n) time algorithm to find this permutation.
-
Essential Complex Biological Modules Explain the Centrality-Lethality Rule
PLoS Computational Biology (Highligthed in Nature Genetics)
The enrichment of high-degree nodes in essential proteins, known as the centrality-lethality rule, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance.
We put forward the hypothesis that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, biological processes essential for organism's vitality and composed of proteins that are densely connected. We present a rigorous analysis of five genomewide protein interaction networks for Saccharomyces cerevisiae supporting our hypothesis and reject two previously proposed explanations for the centrality-lethality rule.
-
Approximating the Interval Constrained Coloring Problem
SWAT 2008 and Algorithmica
The interval constrained coloring problem is the abstraction of a problem that arises in biochemistry. The objective is to assign a color or exchange rate to a set of protein residues such that a set of constraints is satisfied. Each constraint is made up of a closed interval (protein segment) and requirements on the number of elements that belong to each color class (exchange rates observed in the experiments).
We prove that testing feasibility is NP-hard and provide approximation algorithms that produce coloring that just slightly violate the constraints.
-
Lagrangian Relaxation and Partial Cover
STACS
2008
Lagrangian Relaxation has been used for designing algorithms for the partial version of a covering problem using an LMP approximation for the prize-collecting version of the problem. Recently, Konemann et al (ESA 2006) showed how to use an α-LMP as a "black-box" to get an (α 4/3)-approximation for any covering problem.
First we show that the result of Konemman et al is best possible for covering problems in general. Then we study a broad class of covering problems--those that can be written with a totally balanced matrix. By carefully analyzing the inner working of the LMP algorithm we can show that the integrally gap of the natural LP formulation is IP ≤ LP (1+31-k) + k cmax. On the negative side we provide a family of instances where IP > LP (1+3-1-k) + (k/2) cmax.
-
Adaptive Local Ratio
SODA 2008
(Best Student Paper Award) and SIAM Journal on Computing
At the heart of every Local Ratio algorithm is the update step in which a certain "model" function is subtracted from the current weight function in order to make some items tight. Subsequently, these tight items are chosen in the solution, a clean-up step is performed, and the procedure is repeated until feasibility is attained.
This work shows how the choice of "model" function may affect the approximation ratio achieved by the algorithm. Indeed, by turning the problem of choosing a good "model" into an optimization problem of its own, we obtain improved approximations for Data Migration to minimize the average vertex completion time.
-
To Fill or not to Fill: the Gas Station Problems
ESA 2007
and ACM Transactions on Algorithms
Suppose you are planning a road trip across the Unites States: You want to go from New York City to San Francisco. With sky rocketing gas prices, how would you plan your trip to spend as little money as possible? Information about gas prices in a given area is available from sites such as AAA.com or GasPriceWatch.com. How can you use this information to plan the cheapest route?
In this paper we design algorithms for this problem, as well as other problems in this framework.
-
Approximation of Partial Capacitated Vertex Cover
ESA 2007
and
SIAM Journal on Discrete Mathematics
We study the partial capacitated vertex cover problem. The input consists of a graph and a covering requirement. Each edge is associated with a demand (or load), and each vertex is associated with a (soft) capacity and a weight. The objective is to find a minimum cost assignment of edges to vertices such that the total demand of assigned edges fulfills the prescribed requirement.
We present a unified framework for approximating different variants of partial capacitated vertex cover. Our approach is based on the Local Ratio technique and sophisticated charging schemes.
-
Combinatorial Algorithms for Data Migration
APPROX 2006 and
Algorithmica
The Data Migration problem is to compute an efficient plan for moving data stored on devices in a network from one configuration to another. It is modeled by a transfer graph--vertices represent the storage devices, and edges represent data transfers. Every transfer takes one unit of time. A vertex can handle one transfer at a time, and it is said to complete when all the edges (transfers) incident on it complete.
For the objective of minimizing the average vertex completion time, we give a primal-dual 3-approximation for unit processing times and a 5.83-approximation for arbitrary processing times. For minimizing the average edge completion time, we give a simple √2-approximation for bipartite graphs. Our analysis is almost tight.
-
Greedy in Approximation Algorithms
ESA 2006
The objective of this paper is to characterize classes of problems for which a greedy algorithm finds solutions close to optimum. To that end, we introduce the notion of k-extendible systems, a natural generalization of matroids, and show that a greedy algorithm is a 1/k-factor approximation for these systems. Many seemly unrelated problems fit in our framework, e.g.: b-matching, maximum profit scheduling and maximum asymmetric TSP.
In the second half we give better-than-greedy linear time approximation algorithms for b-matching with approximation ratios of 1/2 and 2/3 - ε
-
On the Multi-Radius Cover Problem
Information
Processing Letters
An instance of the multi-radius cover problem consists of a graph G=(V,E) with edge lengths l:E->R. Each vertex u ∈ V represents a transmission station for which a transmission radius ru must be picked. Edges represent a continuum of demand points to be satisfied, that is, for every edge (u,v) ∈ E we ask that ru+rv ≥ luv. The cost of transmitting at radius r from vertex u is given by an arbitrary non-decreasing cost function cu(r). Our goal is to find a cover with minimum total cost ∑u cu(ru).
The multi-radius cover problem is NP-hard as it generalizes the well-known vertex cover problem. In this paper we present a 2-approximation algorithm for it.
-
Weighted Popular Matchings
ICALP 2006
Consider the problem of assigning applicants to jobs. Each applicant has a weight and provides a preference list ranking a subset of the jobs. An applicant x may prefer one matching over the other (or be indifferent between them, in case of a tie) based on the jobs x gets in the two matchings and x's personal preference. A matching M is popular if there is no other matching M' such that the applicants who prefer M' over M outweigh those who prefer M over M'.
This paper give efficient algorithms to find a popular matching, or in case none exists, to establish so.
-
A Primal-Dual Approximation for Partial Vertex Cover: Making Educated
Guesses
APPROX 2005 and
Algorithmica (special issue)
We study the partial vertex cover problem. Given a graph G=(V,E), a weight function on the vertices, and an integer s, our goal is to cover all but s edges, by picking a set of vertices with minimum weight. We provide a primal-dual 2-approximation algorithm which runs in O(n log n + m) time by avoiding the exhaustive guessing step of a previous algorithm.
To get around exhaustive guessing, we do a single dual update and make guesses along the way. One may say the algorithm makes educated guesses. The same trick can be used to get a 2-approximation for capacitated vertex cover, a more general version of the basic problem.
-
Challenges in Selecting Paths for Navigational Queries
WebDB 2004
Life sciences sources are characterized by a complex graph of overlapping sources, and multiple alternate links between sources. A (navigational) query may be answered by traversing multiple alternate paths between a start source and a target source. These paths typically have different benefits and evaluation costs. In prior research, we developed ESearch, an algorithm based on a Deterministic Finite Automaton, which exhaustively enumerates all paths to answer a navigational query. The challenge is to develop heuristics that improve on the exhaustive ESearch solution and identify good utility functions that can rank the sources, the links between sources, and the sub-paths that are already visited, in order to quickly produce paths that have the highest benefit and the least cost.
Other
-
Primal-Dual Algorithms for Combinatorial Optimization Problems
Ph.D. Dissertation
Combinatorial optimization problems such as routing, scheduling, covering and packing problems abound in everyday life. Unfortunately, most problems of interest are NP-hard. One way to cope with NP-hardness is to relax the optimality requirement and instead look for solutions that are provably close to the optimum. This is the main idea behind approximation algorithms.
Arguably, one of the most important techniques in the design of combinatorial algorithms is the primal-dual schema in which the cost of the primal solution is compared to the cost of a dual solution. In this dissertation we study the primal-dual schema in the design of approximation algorithms for a number of covering and scheduling problems.
-
Weighted Popular Matchings
Encyclopedia of
Algorithms
Consider the problem of assigning applicants to jobs. Each applicant has a weight and provides a preference list ranking a subset of the jobs. An applicant x may prefer one matching over the other (or be indifferent between them, in case of a tie) based on the jobs x gets in the two matchings and x's personal preference. A matching M is popular if there is no other matching M' such that the applicants who prefer M' over M outweigh those who prefer M over M'.
This entry surveys the work done on weighted popular matchings.