List of Publications
The conference and journal versions of the same paper are aggregated into a single entry. The local links in each entry point to the most uptodate version of a paper. You can hide the abstracts if you like.
Submitted

A 4approximation for scheduling on a single machine with general cost function
Submitted
We consider a single machine scheduling problem that seeks to minimize a generalized cost function: given a subset of jobs we must order them so as to minimize the sum of jobdependent cost functions.
In a recent paper, Cheung and Shmoys provided a primaldual algorithm for the problem and claimed that is a 2approximation. In this paper we show that their analysis cannot yield an approximation guarantee better than 4. We then cast their algorithm as a local ratio algorithm and show that in fact it has an approximation ratio of 4. Additionally, we consider a more general problem where jobs has release dates and can be preempted. For this version we give a 4κapproximation algorithm where κ is the number of distinct release dates. .
Published

Resolving conflicting predictions from multimapping reads
Journal of Computational Biology
The first step in the analysis of data produced by ultrahighthroughput nextgeneration sequencing technology is to map short sequence ‘reads’ to a reference genome, if available. Sequencing errors, repeat regions, and polymorphisms may lead a read to align to multiple locations in the genome reasonably well. While ignoring such multimapping reads or some of their alignments will reduce the sensitivity of almost any type of downstream analysis, erroneous mappings will typically yield false positive predictions. We propose a framework that aims to identify true predictions among a large set of candidate predictions by selecting for each read a unique mapping that collectively imply conflictfree predictions. We formulate this problem as the maximum facility location problem, for which we propose LProunding heuristics. 
Welfare Maximization in Fractional Hedonic Games
IJCAI 2015
We consider the computational complexity of computing welfare maximizing partitions for fractional hedonic games, a natural class of coalition formation games that can be succinctly represented by a graph. For such games, welfare maximizing partitions constitute desirable ways to cluster the vertices of the graph. We present both intractability results and approximation algorithms for computing welfare maximizing partitions 
On the intersection of independence systems
Operations Research Letters
We study the properties of the intersection of independence systems. We show that the intersection of a ksystem and an fsystem is a (k+f)$system. As an application of our results, we show an improved approximation algorithm for stochastic probing with deadlines over ksystems. 
Parametrized Algorithms for Random Serial Dictatorship
Mathematical Social Sciences
Voting and assignment are two of the most fundamental settings in social choice theory. For both settings, random serial dictatorship (RSD) is a wellknown rule that satisfies anonymity, ex post efficiency, and strategyproofness. Recently, it was shown that computing the resulting probabilities is sharpPcomplete both in the voting and assignment setting. In this paper, we study RSD from a parametrized complexity perspective. More specifically, we present efficient algorithms to compute the RSD probabilities under the condition that the number of agent types, alternatives, or objects is bounded.

How unsplittableflowcovering helps scheduling with jobdependent
cost functions
ICALP 2014
We consider the covering version of the wellstudied and prominent unsplittable flow on a path problem. We present a quasipolynomial time $(1+\epsilon)$approximation algorithm.
We also show how this implies a $(e + \epsilon)$approximation for the problem of scheduling jobs on a single machine where each job has its own cost function and the objective to minimize the sum of the completion costs.

A distributed algorithm for largescale generalized matching
VLDB 2013
Generalized matching problems arise in a number of applications, including computational advertising, recommender systems, and trade markets. Consider, for example, the problem of recommending multimedia items (e.g., DVDs) to users such that (1) users are recom mended items that they are likely to be interested in, (2) every user gets neither too few nor too many recommendations, and (3) only items available in stock are recommended to users. Stateoftheart matching algorithms fail at coping with large realworld instances, which may involve millions of users and items. We propose the first distributed algorithm for computing nearoptimal solutions to largescale generalized matching problems like the one above.

Instancesensitive robustness guarantees for sequencing with unknown packing and covering constraints
ITCS 2013
Sequencing problems with an unknown covering or packing constraint appear in various applications, e.g., in realtime computing environments with uncertain runtime availability. A sequence is called \alpharobust when, for any possible constraint, the maximal or minimal prefix of the sequence that satisfies the constraint is at most a factor \alpha from an optimal packing or covering.
In this work we address the fact that many problem instances may allow for a much better robustness guarantee than the pathological worst case instances. We aim for more meaningful, instancesensitive performance guarantees. We present an algorithm that constructs for each instance a solution with a robustness factor arbitrarily close to optimal.

Enabling mobile distributed social networking on smartphones
MSWiM 2012 and
IEEE Transactions on Mobile Computing
Distributed social networking services show promise to solve data ownership and privacy problems associated with centralized approaches. Smartphones could be used for hosting and sharing users data in a distributed manner, if the associated high communication costs and battery usage issues could be mitigated. We propose MobiTribe, a novel networkaware distributed content sharing system for distributed social networking, which addresses these challenges by taking advantage of network availability patterns to provide continuous availability of content over lowcost network connections. MobiTribe groups devices into tribes among intended content consumers, which replicate content and serve it using lowcost network connections by exploiting time elasticity of user generated content dissemination. We develop a grouping algorithm based on a combination of a bipartite bmatching and a greedy heuristic algorithm which groups mobile devices into tribes in polynomial time.

Optimization problems in dotted interval graphs
WG 2012
and
Discrete Applied Mathematics
The class of Ddotted interval (DDI) graphs is the class of intersection graphs of arithmetic progressions with jump at most D. The class generalizes intervals (D1). We consider various classical graphtheoretic optimization problem when restricted to DDI graphs. Our algorithms are based on classical results in algorithmic graph theory and new structural properties of DDI graphs that may be of independent interest.

On treeconstrained matchings and generalizations
ICALP 2011
and
Algorithmica
We study a generalization of maximum weight bipartite matching, where we are given in addition posets over each side of the bipartition and we add the additional requirement that the matched vertices on each side form an independent set in the corresponding poset. We give approximation algorithms and hardness for the problem.

Inferring physical protein contacts from largescale purification data of protein complexes
Molecular and Cellular Proteomics
Recently published largescale data sets of tandemaffinity protein purifications have allowed unprecedented insights into the organization of cellular protein complexes. Several computational methods have been developed to detect cocomplexing proteins in these data, aiding in the identification of biologically relevant protein complexes. While it is now possible to identify key members of protein complexes in a highthroughput fashion, much less is known about the proteinconnecting network of physical contacts within these purifications that determines many aspects of complex formation, dynamics, and disease etiology.
We introduce a novel approach for predicting physical contacts in protein purifications and compare its performance to four established methods for scoring cocomplexed protein pairs based on several highconfidence experimental reference sets.

Bonsai: Growing interesting small trees
ICDM 2010
In this paper, we address the problem of finding such cardinality constrained subgraphs from large nodeweighted graphs that maximize the sum of weights of selected nodes. We provide an efficient constantfactor approximation algorithm for this strongly NPhard problem. A key feature of our approach is that we can simultaneously compute heavy subgraphs for a range of cardinality constraints, thus making it particularly suited for browsing operations during visual analytics. We show how our techniques can be applied in a variety of application settings such as discovering functional modules (most deviant subnetworks) in proteinprotein interaction graphs, identifying active cores of topical subgraphs of the Wikipedia graph, detection of communities in social networks, etc.

When LP is the cure for your matching woes: Improved bounds
for stochastic matchings
ESA 2010 (Best paper award) and Algorithmica (special issue)
In this paper we study stochastic matching problems that are motivated by applications in online dating and kidney exchange programs. Our main result is an algorithm that finds a matchingprobing strategy having a small constant approximation ratio. An interesting aspect of our approach is that we compare the cost our solution to the optimal edgeprobing strategy. Thus, we indirectly show that the best matchingprobing strategy is only a constant factor away from the best edgeprobing strategy.
(This is the merger of two papers: arxiv:1003.0167 and arxiv:1002.3763.)

The checkpoint problem
APPROX
2010
and
Theoretical
Computer Science
In this paper, we consider the checkpoint problem in which given an undirected graph G, a set of sourcedestinations pairs and a set of fixed paths P between them, the goal is to ﬁnd a set of checkpoint edges E' which disconnect each pair and minimize the average or minimize the maximum intersection with each path in P. This problem has several natural applications, e.g., in urban transportation and network security, and in a sense combines the multicut problem and the minimum membership set cover problem. We give approximation algorithms and hardness of approximation for different variants of the problem.

Universal sequencing on a single machine
IPCO
2010 and SIAM Journal on Computing
We consider scheduling on an unreliable machine that may experience unexpected changes in processing speed or even full breakdowns. We aim for a universal solution that performs well without adaptation for any possible machine behavior. For the objective of minimizing the total weighted completion time, we design a polynomial time deterministic algorithm that finds a universal scheduling sequence with a solution value within 4 times the value of an optimal clairvoyant algorithm that knows the disruptions in advance. A randomized version of this algorithm attains a ratio of e. We we that both results are best possible among all universal solutions. We also consider the variant where jobs have release dates.

Assigning Papers to Referees
Algorithmica (special issue)
We study the problem of assigning papers to referees. We propose to optimize a number of criteria that aim at achieving fairness among referees/papers. Some of these variants can be solved optimally in polynomial time, while others are NPhard, in which case we design approximation algorithms. Experimental results strongly suggest that the assignments computed by our algorithms are considerably better than those computed by popular conference management software.

Improved approximation guarantees for weighted matching in the semistreaming model
STACS 2010
and
SIAM Journal on Discrete Mathematics
We study the maximum weight matching problem in the semistreaming model, and improve on the currently best onepass algorithm due to Zelke (STACS'2008) by devising a deterministic approach whose performance guarantee is 4.91. In addition, we study preemptive online algorithms, a subclass of onepass algorithms where we are only allowed to maintain a feasible matching in memory at any point in time. All known results prior to Zelke's belong to this subclass. We provide a lower bound of 4.967 on the competitive ratio of any such deterministic algorithm, and hence show that future improvements will have to store in memory a set of edges which is not necessarily a feasible matching.

A polynomial delay algorithm for enumerating approximate solutions to the interval coloring problem
ALENEX 2010 and Journal of Experimental Algorithms
We study the interval constrained coloring problem, a combinatorial problem arising in the interpretation of data on protein structure emanating from experiments based on hydrogen/deuterium exchange and mass spectrometry. The problem captures the challenging task of increasing the spatial resolution of experimental data in order to get a better picture of the protein structure. Since solutions proposed by any algorithmic framework have to ultimately be veriﬁed by biochemists, it is important to provide not just a single solution, but a valuable set of candidate solutions. Our contribution is a polynomial delay, polynomial space algorithm for enumerating all exact solutions plus further approxi mate solutions, whose components are guaranteed to be within an absolute error of one of the optimum. Our experiments indicate that these approximate solutions are reasonably close to the optimal ones, in terms of the cumulative error.

Parametric Packing of Selfish Items and the Subset Sum Algorithm
WINE 2009
and
Algorithmica
The subset sum algorithm is a natural heuristic for the classical Bin Packing problem: In each iteration, the algorithm ﬁnds among the unpacked items, a maximum size set of items that ﬁts into a new bin. More than 35 years after its ﬁrst mention in the literature, establishing the worstcase performance of this heuristic remains, surprisingly, an open problem.
We establish the exact approximation ratio of the subset sum algorithm and generalize this to the parametric case. As previous work showed, this implies tight bounds for the Strong Price of Anarchy of the bin packing game. Finally, we study the pure Price of Anarchy of the parametric Bin Packing game for which we show nearly tight upper and lower bounds.

Maxcoloring paths: Tight bounds and extensions
ISAAC 2009 and Journal of Combinatorial Optimization (special issue)
The maxcoloring problem takes as input a graph with a weight function. Given a vertex coloring of the graph, the weight of a color class is the maximum weight any node in the class. The problem is to compute a legal coloring of the graph such that sum of the weights of the color classes is minimized. Maxcoloring general graphs is as hard as the classical vertex coloring problem, a special case where vertices have unit weight. Here consider the problem of maxcoloring paths and its generalization, maxcoloring a broad class of trees. Tight upper and lower bounds on the time complexity of this problem.

Popular mixed matchings
ICALP 2009 and Theoretical Computer Science (special issue)
Consider the problem of matching applicants to jobs under onesided preferences. A matching M is said to be more popular than T if the applicants that prefer M to T outnumber those that prefer T to M. A matching is said to be popular if there is no matching more popular than it.
The main drawback of the popular matching concept is that it is not guaranteed to exist. In this work we address this issue by considering mixed matchings, that is, probability distribution over matchings. We show that popular mixed matchings always exist and design polynomial time algorithms for finding one.

Improved Approximation Algorithms for 1.5D Terrain Guarding
STACS 2009 and Algorithmica
We present a 4approximation algorithm for the problem of placing the fewest guards on a 1.5D terrain so that every point of the terrain is seen by at least one guard.
Our method is based on rounding the linear programming relaxation of the corresponding covering problem. Besides the simplicity of the analysis, which mainly relies on decomposing the constraint matrix of the LP into totally balanced matrices. Unlike previous work, our algorithm generalizes to the weighted and partial versions of the basic problem.

An Optimal Incremental Algorithm for Minimizing Lateness with Rejection
ESA 2008
We reexamine the problem of scheduling jobs on a single machine to minimize maximum lateness with rejection. While this problem can be easily solved optimally in polynomial time, we show the following surprising result: there is a fixed permutation of the jobs, such that if we reject the first i jobs from this permutation, we derive an optimal solution for the problem in which we are allowed to reject i jobs. Moreover, we develop an optimal O(n log n) time algorithm to find this permutation.

Essential Complex Biological Modules Explain the CentralityLethality Rule
PLoS Computational Biology (Highligthed in Nature Genetics)
The enrichment of highdegree nodes in essential proteins, known as the centralitylethality rule, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance.
We put forward the hypothesis that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, biological processes essential for organism's vitality and composed of proteins that are densely connected. We present a rigorous analysis of five genomewide protein interaction networks for Saccharomyces cerevisiae supporting our hypothesis and reject two previously proposed explanations for the centralitylethality rule.

Approximating the Interval Constrained Coloring Problem
SWAT 2008 and Algorithmica
The interval constrained coloring problem is the abstraction of a problem that arises in biochemistry. The objective is to assign a color or exchange rate to a set of protein residues such that a set of constraints is satisfied. Each constraint is made up of a closed interval (protein segment) and requirements on the number of elements that belong to each color class (exchange rates observed in the experiments).
We prove that testing feasibility is NPhard and provide approximation algorithms that produce coloring that just slightly violate the constraints.

Lagrangian Relaxation and Partial Cover
STACS
2008
Lagrangian Relaxation has been used for designing algorithms for the partial version of a covering problem using an LMP approximation for the prizecollecting version of the problem. Recently, Konemann et al (ESA 2006) showed how to use an αLMP as a "blackbox" to get an (α 4/3)approximation for any covering problem.
First we show that the result of Konemman et al is best possible for covering problems in general. Then we study a broad class of covering problemsthose that can be written with a totally balanced matrix. By carefully analyzing the inner working of the LMP algorithm we can show that the integrally gap of the natural LP formulation is IP ≤ LP (1+3^{1k}) + k c_{max}. On the negative side we provide a family of instances where IP > LP (1+3^{1k}) + (k/2) c_{max}.

Adaptive Local Ratio
SODA 2008
(Best Student Paper Award) and SIAM Journal on Computing
At the heart of every Local Ratio algorithm is the update step in which a certain "model" function is subtracted from the current weight function in order to make some items tight. Subsequently, these tight items are chosen in the solution, a cleanup step is performed, and the procedure is repeated until feasibility is attained.
This work shows how the choice of "model" function may affect the approximation ratio achieved by the algorithm. Indeed, by turning the problem of choosing a good "model" into an optimization problem of its own, we obtain improved approximations for Data Migration to minimize the average vertex completion time.

To Fill or not to Fill: the Gas Station Problems
ESA 2007
and ACM Transactions on Algorithms
Suppose you are planning a road trip across the Unites States: You want to go from New York City to San Francisco. With sky rocketing gas prices, how would you plan your trip to spend as little money as possible? Information about gas prices in a given area is available from sites such as AAA.com or GasPriceWatch.com. How can you use this information to plan the cheapest route?
In this paper we design algorithms for this problem, as well as other problems in this framework.

Approximation of Partial Capacitated Vertex Cover
ESA 2007
and
SIAM Journal on Discrete Mathematics
We study the partial capacitated vertex cover problem. The input consists of a graph and a covering requirement. Each edge is associated with a demand (or load), and each vertex is associated with a (soft) capacity and a weight. The objective is to find a minimum cost assignment of edges to vertices such that the total demand of assigned edges fulfills the prescribed requirement.
We present a unified framework for approximating different variants of partial capacitated vertex cover. Our approach is based on the Local Ratio technique and sophisticated charging schemes.

Combinatorial Algorithms for Data Migration
APPROX 2006 and
Algorithmica
The Data Migration problem is to compute an efficient plan for moving data stored on devices in a network from one configuration to another. It is modeled by a transfer graphvertices represent the storage devices, and edges represent data transfers. Every transfer takes one unit of time. A vertex can handle one transfer at a time, and it is said to complete when all the edges (transfers) incident on it complete.
For the objective of minimizing the average vertex completion time, we give a primaldual 3approximation for unit processing times and a 5.83approximation for arbitrary processing times. For minimizing the average edge completion time, we give a simple √2approximation for bipartite graphs. Our analysis is almost tight.

Greedy in Approximation Algorithms
ESA 2006
The objective of this paper is to characterize classes of problems for which a greedy algorithm finds solutions close to optimum. To that end, we introduce the notion of kextendible systems, a natural generalization of matroids, and show that a greedy algorithm is a 1/kfactor approximation for these systems. Many seemly unrelated problems fit in our framework, e.g.: bmatching, maximum profit scheduling and maximum asymmetric TSP.
In the second half we give betterthangreedy linear time approximation algorithms for bmatching with approximation ratios of 1/2 and 2/3  ε

On the MultiRadius Cover Problem
Information
Processing Letters
An instance of the multiradius cover problem consists of a graph G=(V,E) with edge lengths l:E>R. Each vertex u ∈ V represents a transmission station for which a transmission radius r_{u} must be picked. Edges represent a continuum of demand points to be satisfied, that is, for every edge (u,v) ∈ E we ask that r_{u}+r_{v} ≥ l_{uv}. The cost of transmitting at radius r from vertex u is given by an arbitrary nondecreasing cost function c_{u}(r). Our goal is to find a cover with minimum total cost ∑_{u} c_{u}(r_{u}).
The multiradius cover problem is NPhard as it generalizes the wellknown vertex cover problem. In this paper we present a 2approximation algorithm for it.

Weighted Popular Matchings
ICALP 2006 and ACM Transactions on Algorithms
Consider the problem of assigning applicants to jobs. Each applicant has a weight and provides a preference list ranking a subset of the jobs. An applicant x may prefer one matching over the other (or be indifferent between them, in case of a tie) based on the jobs x gets in the two matchings and x's personal preference. A matching M is popular if there is no other matching M' such that the applicants who prefer M' over M outweigh those who prefer M over M'.
This paper give efficient algorithms to find a popular matching, or in case none exists, to establish so.

A PrimalDual Approximation for Partial Vertex Cover: Making Educated
Guesses
APPROX 2005 and
Algorithmica (special issue)
We study the partial vertex cover problem. Given a graph G=(V,E), a weight function on the vertices, and an integer s, our goal is to cover all but s edges, by picking a set of vertices with minimum weight. We provide a primaldual 2approximation algorithm which runs in O(n log n + m) time by avoiding the exhaustive guessing step of a previous algorithm.
To get around exhaustive guessing, we do a single dual update and make guesses along the way. One may say the algorithm makes educated guesses. The same trick can be used to get a 2approximation for capacitated vertex cover, a more general version of the basic problem.

Challenges in Selecting Paths for Navigational Queries
WebDB 2004
Life sciences sources are characterized by a complex graph of overlapping sources, and multiple alternate links between sources. A (navigational) query may be answered by traversing multiple alternate paths between a start source and a target source. These paths typically have different benefits and evaluation costs. In prior research, we developed ESearch, an algorithm based on a Deterministic Finite Automaton, which exhaustively enumerates all paths to answer a navigational query. The challenge is to develop heuristics that improve on the exhaustive ESearch solution and identify good utility functions that can rank the sources, the links between sources, and the subpaths that are already visited, in order to quickly produce paths that have the highest benefit and the least cost.
Other

PrimalDual Algorithms for Combinatorial Optimization Problems
Ph.D. Dissertation
Combinatorial optimization problems such as routing, scheduling, covering and packing problems abound in everyday life. Unfortunately, most problems of interest are NPhard. One way to cope with NPhardness is to relax the optimality requirement and instead look for solutions that are provably close to the optimum. This is the main idea behind approximation algorithms.
Arguably, one of the most important techniques in the design of combinatorial algorithms is the primaldual schema in which the cost of the primal solution is compared to the cost of a dual solution. In this dissertation we study the primaldual schema in the design of approximation algorithms for a number of covering and scheduling problems.

Weighted Popular Matchings
Encyclopedia of
Algorithms
Consider the problem of assigning applicants to jobs. Each applicant has a weight and provides a preference list ranking a subset of the jobs. An applicant x may prefer one matching over the other (or be indifferent between them, in case of a tie) based on the jobs x gets in the two matchings and x's personal preference. A matching M is popular if there is no other matching M' such that the applicants who prefer M' over M outweigh those who prefer M over M'.
This entry surveys the work done on weighted popular matchings.