Project offerings 2016

Click on the supervisor's name for a description of the projects they are offering.

Projects will be added over the coming weeks.

Supervisor / Project(s)

Wei Bao
  • Coordinating cloud-based computing in mobile networks

Vera Chung
  • Video Object tracking for visiting applications
  • Moving object detection for video surveillance
  • Study of behaviour detection algorithm for objects in video streams

Joseph Davis
  • Humans in the Loop: Improving the Performance of Computer Systems for Image Analysis and Understanding

Vincent Gramoli
  • USyd Malware Traffic Analysis
  • Blockchain: Can We Reach Consensus?
  • Benchmarking Concurrent Data Structures on Multi-core Machines
  • Evaluating Consensus Protocols in Distributed Systems

Joachim Gudmundsson
  • Algorithms for team sports analysis
  • Turbocharging Treewidth Heuristics

Ralph Holz
  • Mining software repositories for security
  • A deep look at the Domain Name System
  • JScan - Analysis of JavaScript on the Web
  • Mail Graph: analysing the security of email transport over the Internet
  • A notary system for email
  • Reproducibility of empirical results
  • Investigating the security of Internet routing
  • Watching the watchers in the Web PKI

Seok-Hee Hong
  • Big Data Visual Analytics

Bryn Jeffries
  • Classifying Sleep Apnoea
  • In-database motif detection of time series data
  • Feature extraction of speech task data

Jinman Kim
  • Medical Image Visualisation
  • Ray Profile Analysis for Volume Rendering Visualisation
  • MDT Visualisation
  • Exploratory Visualisation with Virtual and Augmented Reality Devices
  • Medical Avatar

Irena Koprinska
  • Predicting hospital admissions
  • Classification of breathing patterns using machine learning techniques
  • Revealing the development of the immune response through dynamic clustering

Kevin Kuan
  • Emotional Contagion in Online Social Networks
  • Factors Affecting User Perception of Online Consumer Reviews
  • Factors Affecting Consumer Behavior in Group-Buying
  • Factors Affecting the Effectiveness of Mobile Personalisation
  • Information Overload in Online Decision Making
  • Development of a Scale for Assessing eHealth Literacy in Australia

Na Liu
  • Information system design for aging population
  • Online self-diagnosis
  • Design and implement a network-based social support system

David Lowe
  • Augmented reality remotely-accessed labs
  • Architecture for collaborative remotely-accessed labs
  • Remote control of physical systems: the role of context

Simon Poon
  • Analytical models for Causal Complexities
  • A Framework for Health Technology Evaluation
  • A framework to model organizational Complexities in IT business value study

Bernhard Scholz
  • Datalog Evaluation Strategies Based on Learning
  • Efficient Data-Structures for Logical Relations
  • Magic Set Transformation in Soufflé
  • Automatic Deduction of Complexity Classes in Soufflé
  • Components in Soufflé
  • Infinite Lattices in Soufflé
  • Benchmarking and Test Harness in Soufflé
  • Parallelization of Stream Programs

Xiuying Wang
  • Intelligent recommender
  • Cognitive Decision System based on Behavioural Mining
  • Dynamic drop/creation of database views
  • Interactive Decision-Making Dashboard

Zhiyong Wang and David Feng
  • Multimedia Data Summarization
  • Predictive Analytics of Big Time Series Data
  • Human Motion Analysis, Modeling, Animation, and Synthesis

Bing Zhou
  • Computing and visualising transcription factor interaction networks

Albert Zomaya
  • Addressing Interoperability in the Internet of Things
  • Computation Offload in Fog Computing
  • Handling Big Data-Streams in the Internet of Things
  • Task Scheduling Algorithms for the Internet of Things
  • Virtualization of Wireless Sensor Networks
  • Corporate networks and cascading failures: how shareholding patterns are influential in potential financial meltdowns
  • Placement matters in making good decisions: understanding how you can improve your cognitive decision making abilities by re-positioning yourself in your network
  • Is the society preventing you from getting married? Cracking the matching problem using network theory

Projects supervised by Wei Bao

Coordinating cloud-based computing in mobile networks
The current mobile system has realized our long-held ambition of anywhere, anytime high-speed broadband communication. Meanwhile, the emergence of cloud computing services allows us to use pools of always-on computing resources at low cost and in an on-demand fashion. Together they have stimulated a boom in new smart services and applications. There are many exciting new R&D opportunities where cloud computing services are integrated into mobile networks to further improve their performance and functionality.

In this project, we aim to study the optimal coordination between mobile communication and computation in an integrated communication-computation paradigm. The designed schemes are expected to improve network performance and user experience, as well as to satisfy a variety of constraints such as throughput limitations at wireless links, power limitations at mobile devices, computing capacity constraints at the central cloud, and delay constraints in data transmission and data processing. This project is suitable for self-motivated students who are interested in analyzing, designing, measuring, and/or implementing networked systems.
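For orientation only, here is a minimal sketch of the kind of trade-off such a coordination scheme optimises: deciding, for a single task, whether to process it on the device or offload it to the cloud, based on simple delay and energy estimates. All parameters are hypothetical and far simpler than a real formulation, which would also capture wireless throughput limits, cloud capacity and queueing.

```python
# Illustrative sketch of a local-vs-cloud offloading decision.
# All parameters are hypothetical; a real formulation would also model
# wireless throughput limits, cloud capacity constraints and queueing delays.

def offload_decision(task_cycles, data_bits,
                     cpu_hz_local=1e9, cpu_hz_cloud=8e9,
                     uplink_bps=20e6, energy_per_cycle=1e-9,
                     tx_power_watts=0.5):
    """Return ('local'|'cloud', delay_s, energy_j), minimising delay."""
    # Local execution: delay and energy come from the device CPU alone.
    local_delay = task_cycles / cpu_hz_local
    local_energy = task_cycles * energy_per_cycle

    # Offloading: pay the uplink transfer, then compute on the faster cloud.
    tx_delay = data_bits / uplink_bps
    cloud_delay = tx_delay + task_cycles / cpu_hz_cloud
    cloud_energy = tx_power_watts * tx_delay  # device only spends energy transmitting

    if cloud_delay < local_delay:
        return "cloud", cloud_delay, cloud_energy
    return "local", local_delay, local_energy

if __name__ == "__main__":
    print(offload_decision(task_cycles=2e9, data_bits=4e6))
```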

Projects supervised by Vera Chung

Video Object tracking for visiting applications
This project is to study object tracking algorithms, including Particle Swarm Optimization (PSO), and apply them to video tracking in visiting applications. We have real-world animal video data captured from a zoo. You will implement the object tracking algorithms for these videos for animal tracking, give performance comparisons, and customise or improve the tracking results for this kind of application. It is also possible to improve the tracking results by combining the video data with the simultaneous RFID data available (offline).

Requirements: good programming skills in C++ or Java
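As a toy illustration of the particle swarm update rule that PSO-based trackers build on, the sketch below moves a swarm of candidate positions towards the location that maximises a matching score. The matching function here is a stand-in; a real tracker would score appearance features (e.g. colour histograms) of each candidate window against the target template.

```python
import numpy as np

# Toy PSO search for the image position that maximises a matching score.
# match_score is a placeholder for a real appearance-matching function.

def pso_track(match_score, frame_shape, n_particles=30, iters=50,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    h, w_img = frame_shape
    pos = rng.uniform([0, 0], [h, w_img], size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_score = np.array([match_score(p) for p in pos])
    gbest = pbest[np.argmax(pbest_score)].copy()

    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        # Standard PSO velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, [0, 0], [h - 1, w_img - 1])
        score = np.array([match_score(p) for p in pos])
        improved = score > pbest_score
        pbest[improved], pbest_score[improved] = pos[improved], score[improved]
        gbest = pbest[np.argmax(pbest_score)].copy()
    return gbest  # estimated object centre (row, col)

if __name__ == "__main__":
    target = np.array([120.0, 200.0])
    # Stand-in score: higher when a candidate is closer to the true position.
    score = lambda p: -np.linalg.norm(p - target)
    print(pso_track(score, frame_shape=(240, 320)))
```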

Moving object detection for video surveillance
In visual surveillance of both humans and vehicles, a video stream is processed to characterize the events of interest through the detection of moving objects in each frame. Many errors in higher-level tasks such as tracking are due to false detections. This project will design and implement a new system to detect moving objects in surveillance applications.

Requirements: good programming skills in C++ or Java
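As a starting point, here is a sketch of a simple running-average background model with frame differencing, one common baseline that a new detection system would be compared against. The synthetic frames below are stand-ins; in practice frames would come from a video decoder.

```python
import numpy as np

# Baseline moving-object detector: running-average background subtraction.
# Frames are assumed to be greyscale 2-D arrays with values in [0, 255].

class BackgroundSubtractor:
    def __init__(self, alpha=0.05, threshold=30):
        self.alpha = alpha          # background learning rate
        self.threshold = threshold  # foreground intensity difference
        self.background = None

    def apply(self, frame):
        frame = frame.astype(np.float32)
        if self.background is None:
            self.background = frame.copy()
        # Pixels that differ strongly from the background are foreground.
        mask = np.abs(frame - self.background) > self.threshold
        # Slowly adapt the background to illumination changes.
        self.background = (1 - self.alpha) * self.background + self.alpha * frame
        return mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sub = BackgroundSubtractor()
    static = rng.integers(0, 50, size=(120, 160))
    for t in range(20):
        frame = static.copy()
        frame[40:60, 30 + 3 * t: 50 + 3 * t] = 200   # a moving bright block
        mask = sub.apply(frame)
    print("foreground pixels in last frame:", int(mask.sum()))
```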

Study of behaviour detection algorithm for objects in video streams
Object motion plays an important role in the semantic understanding of videos. Object motion analysis and behavior understanding are key to developing intelligent systems in many domains such as visual surveillance and handwriting recognition. This project will survey and study object trajectory clustering, which includes trajectory segmentation to identify a number of sub-trajectories in a trajectory, feature vector sequence formation, and visual word sequence creation.

Requirements: good programming skills in Python
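A small sketch of one possible first step, trajectory segmentation: splitting a trajectory into sub-trajectories wherever the heading changes sharply. The angle threshold is an illustrative choice, not one taken from the project.

```python
import numpy as np

# Split a 2-D trajectory into sub-trajectories at sharp turning points.
# The 45-degree threshold is arbitrary and purely illustrative.

def segment_trajectory(points, angle_threshold_deg=45.0):
    points = np.asarray(points, dtype=float)
    segments, start = [], 0
    for i in range(1, len(points) - 1):
        v1 = points[i] - points[i - 1]
        v2 = points[i + 1] - points[i]
        denom = np.linalg.norm(v1) * np.linalg.norm(v2)
        if denom == 0:
            continue
        cos_angle = np.clip(np.dot(v1, v2) / denom, -1.0, 1.0)
        if np.degrees(np.arccos(cos_angle)) > angle_threshold_deg:
            segments.append(points[start:i + 1])
            start = i
    segments.append(points[start:])
    return segments

if __name__ == "__main__":
    # An L-shaped path is split into its two straight legs.
    traj = [(x, 0) for x in range(5)] + [(4, y) for y in range(1, 5)]
    print([len(s) for s in segment_trajectory(traj)])
```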

Projects supervised by Joseph Davis

Humans in the Loop: Improving the Performance of Computer Systems for Image Analysis and Understanding
While there has been some improvement in the performance of computer systems for image understanding (identifying discrete objects, recognising the relationships between objects, recognising any actions taking place, etc., in order to provide accurate captions), it is widely believed that progress is slow. This project will involve extending existing algorithms by integrating human computation (accessed through on-demand crowdsourcing platforms such as Mechanical Turk) and running a set of experiments to comparatively investigate the performance of computer and computer-human hybrid systems in image analysis and understanding tasks.

Projects supervised by Vincent Gramoli

USyd Malware Traffic Analysis
In 2014, McAfee estimated the annual cost to the global economy from cybercrime as more than $400 billion. This includes the cost related to the stolen personal information of hundreds of millions of people.
In the 2012 cybercrime and security survey commissioned by Australia's national CERT, 20% of the surveyed companies identified cybersecurity incidents in the previous year, with 21% of these incidents involving trojan/rootkit malware.

The goal of this project is to analyze the traffic at the University of Sydney using powerful multi/many-core servers connected through a 10Gbps network running Intrusion Detection and Prevention Systems, in order to learn about malware activity among machines accessing the Internet from the University campus.

The first phase of the project consists of deploying software components at the servers located between the university network and the Internet to help identify malware threats. The second phase of the project consists of gathering accesses to infected websites. The third phase of the project consists of quantifying the threats by analyzing the collected data and drawing conclusions.

The project requires knowledge of some network technologies such as tcpdump, Wireshark, pcap files or Suricata.
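For a feel of the third phase, here is a small sketch that tallies alert signatures from a Suricata EVE JSON log, assuming Suricata's default newline-delimited eve.json output format; the file path is hypothetical.

```python
import json
from collections import Counter

# Count alert signatures in a Suricata EVE JSON log (newline-delimited JSON).
# The path is hypothetical; the field names assume Suricata's default
# eve.json alert records ("event_type", "alert", "signature").

def top_alert_signatures(path="eve.json", top_n=10):
    counts = Counter()
    with open(path) as log:
        for line in log:
            try:
                event = json.loads(line)
            except ValueError:
                continue  # skip truncated or malformed lines
            if event.get("event_type") == "alert":
                counts[event["alert"].get("signature", "unknown")] += 1
    return counts.most_common(top_n)

if __name__ == "__main__":
    for signature, count in top_alert_signatures():
        print(f"{count:6d}  {signature}")
```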

Blockchain: Can We Reach Consensus?
Blockchain is a disruptive technology promising to minimize the cost of ownership transfers over the Internet. While this technology has already shown promise for exchanging cryptocurrencies, like Bitcoin, it can potentially transfer various types of assets [1]. The key underlying principle is a distributed ledger to which miners can append a block encoding the newly requested transactions. Once recorded, blocks are immutable, allowing participants to audit and verify the blockchain since its genesis block.

To prevent blocks with conflicting transactions from being appended concurrently to the chain, participants must run a consensus algorithm that guarantees a total order of the blocks. Unfortunately, consensus has been known to be unsolvable for three decades [2], but practical systems have been designed to get as close to a solution as possible. As financial transactions give malicious users a clear incentive to break the consensus properties, it is crucial to understand the weaknesses of current implementations of Byzantine agreement [3] in order to improve existing alternatives.

The goal of this research project is to investigate existing solutions whose code is available, like Ethereum [1], and design an efficient and secure consensus prototype for blockchains.

[1] Ethereum
[2] Impossibility of Distributed Consensus with One Faulty Process. Fischer, Michael J. and Lynch, Nancy A. and Paterson, Michael S. JACM 1985.
[3] The Byzantine Generals Problem. Lamport, Leslie and Shostak, Robert and Pease, Marshall. TOPLAS 1982.
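To make the "immutable chain of blocks" idea concrete, here is a toy sketch of a hash-linked chain and an integrity check; it deliberately omits consensus, proof-of-work and networking, which are the actual subject of the project.

```python
import hashlib
import json

# Toy hash-linked chain: each block commits to the previous block's hash,
# so changing any recorded transaction invalidates every later block.
# Consensus, proof-of-work and networking are deliberately omitted.

def block_hash(block):
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, transactions):
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev_hash,
                  "transactions": transactions})

def verify_chain(chain):
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

if __name__ == "__main__":
    chain = []
    append_block(chain, [])                       # genesis block
    append_block(chain, [{"from": "a", "to": "b", "amount": 5}])
    append_block(chain, [{"from": "b", "to": "c", "amount": 2}])
    print(verify_chain(chain))                    # True
    chain[1]["transactions"][0]["amount"] = 500   # tamper with history
    print(verify_chain(chain))                    # False
```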

Benchmarking Concurrent Data Structures on Multi-core Machines
For the last decade, manufacturers have increased the number of processors, or cores, in most computational devices rather than their frequency. This trend led to the advent of chip multiprocessors that nowadays offer between tens and a thousand cores on the same chip. Concurrent programming, the art of dividing a program into subroutines that cores execute simultaneously, is the only way for developers to increase the performance of their software.

These multicore machines adopt a concurrent execution model where, typically, multiple threads synchronize with each other to exploit cores while accessing in-memory shared data. To continue the pace of increasing software efficiency, performance has to scale with the amount of concurrent threads accessing shared data structures. The key is for new synchronization paradigms to not only leverage concurrent resources to achieve scalable performance but also to simplify concurrent programming so that most programmers can develop efficient software.

We have developed Synchrobench in C/C++ and Java [1], the most comprehensive set of synchronization tools and concurrent data structure algorithms, and the implementations of concurrent data structures and synchronization techniques keep multiplying. The goal of this project is to extend the Synchrobench benchmark suite with new concurrent data structures and to compare their performance against the 30+ existing algorithms on our multi-core machines, in order to determine the design choices that maximize performance in upcoming concurrent data structures.

[1] Synchrobench
[2] Vincent Gramoli. More Than You Ever Wanted to Know about Synchronization: Synchrobench, Measuring the Impact of the Synchronization on Concurrent Algorithms. PPoPP 2015.

Evaluating Consensus Protocols in Distributed Systems
Distributed system solutions, like CoreOS used by Facebook, Google and Twitter, exploit a key-value store abstraction to replicate the state and a consensus protocol to totally order the state machine configurations. Unfortunately, there is no way to reconfigure this key-value store service, to include new servers or exclude failed ones, without disruption.

The Paxos consensus algorithm, in which candidate leaders exchange messages with majorities, could also be used to reconfigure a key-value store [4]. To circumvent the impossibility of implementing consensus with asynchronous communications, Paxos guarantees termination under partial synchrony while always guaranteeing validity and agreement, despite competing candidate leaders proposing configurations.

Due to the intricacy of the protocol [1], the tendency has been to switch to an alternative algorithm where requests are centralized at a primary. Zab, a primary-based atomic broadcast protocol, was used in ZooKeeper [2], a distributed coordination service. Raft [1] reused the centralization concept of ZooKeeper to solve consensus. The resulting simplification led to the development of various implementations of Raft in many programming languages.
The goal of this project is to compare a Raft-based implementation to Paxos-based implementations [3] to confirm that Paxos can be better suited than Raft in case of leader failures and explore cases where Raft could be preferable.

[1] Diego Ongaro and John Ousterhout. In search of an understandable consensus algorithm. In ATC, pages 305–319, Philadelphia, PA, 2014. USENIX.
[2] Flavio Junqueira and Benjamin Reed. ZooKeeper: Distributed Process Coordination. O’Reilly Media, Nov. 2013.
[3] Vincent Gramoli, Len Bass, Alan Fekete, Daniel Sun. Rollup: Non-Disruptive Rolling Upgrade. USyd Technical Report 699.

Projects supervised by Joachim Gudmundsson

Algorithms for team sports analysis
There is currently considerable interest in research for developing objective methods of analysing sports, including football (soccer). Football analysis has practical applications in player evaluation for coaching and scouting; development of game and competition strategies; and also to enhance the viewing experience of televised matches. Until recently, the analysis of football matches was typically done manually or by using simple frequency analysis. However, such analysis was primarily concerned with what happened, and did not consider where, by whom or why. Recent innovations in image processing and object detection allow accurate spatio-temporal data on the players and ball to be captured. This project looks at how to use tools from machine learning and computational geometry to develop provably good analytical tools for evaluating a team or a player.

Good programming skills and a good algorithmic background are required.

Knowledge of basic aspects of machine learning would be very useful but not essential.

Turbocharging Treewidth Heuristics
Treewidth is a number associated with graphs that measures (informally speaking) how cyclic or acyclic a given graph is. A by-product of computing the treewidth is a so-called tree decomposition. Tree decompositions have many applications in algorithms for computationally hard problems in a variety of areas such as optimization, networks and AI. Unfortunately, computing the treewidth and tree decompositions is itself computationally hard. Therefore, in practice heuristics are commonly used for calculating them.

The goal of this project is to improve the performance of these heuristics by using recent insights from parameterized complexity theory. Besides the implementation of the theoretical results, an important part of the project is benchmarking the new heuristics and comparing them with the old ones. If the project is a success, the refined treewidth heuristics would immediately speed up all the algorithms that are based on tree decompositions.

Requirements: strong programming skills; analytical skills; and a sound basic knowledge of mathematics.
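As background, here is a sketch of the classic minimum-degree elimination heuristic, which gives an upper bound on treewidth and is one of the standard heuristics such a project might start from. It is plain Python with the graph stored as adjacency sets.

```python
# Minimum-degree elimination heuristic: an upper bound on treewidth.
# At each step, eliminate a vertex of minimum degree, turning its
# neighbourhood into a clique; the largest neighbourhood seen bounds
# the width of the corresponding tree decomposition.

def min_degree_width(adj):
    """adj: dict mapping vertex -> set of neighbours (undirected graph)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    width = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        neighbours = adj[v]
        width = max(width, len(neighbours))
        # Add fill-in edges so the neighbourhood becomes a clique.
        for a in neighbours:
            adj[a] |= neighbours - {a}
            adj[a].discard(v)
        del adj[v]
    return width

if __name__ == "__main__":
    # A 4-cycle has treewidth 2; the heuristic matches it here.
    cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    print(min_degree_width(cycle4))  # 2
```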

Projects supervised by Ralph Holz

Mining software repositories for security
Version control systems (VCS) are indispensable in modern software development. They are also extremely useful as a source of meta-information: they keep track of the contributions of different authors. Meanwhile, recent work has shown that programmers have distinct styles that can be distinguished by machines.

This project has its roots in the SecOps space. We are going to develop a (prototypical) support system for a VCS environment (e.g. GitLab or similar). The support system learns from historical commits and checks new code commits for patterns that are untypical for the programmer in question. The goal is to help the programmer avoid mistakes by flagging unusual commits for later inspection - and possibly to detect intrusions where an attacker has gained access to the system and attempts to commit in someone else's name.

This project is in collaboration with Data61 (formerly NICTA). We are going to base our work on parsing and visualisation code that has been developed at NICTA. We first develop a methodology to identify features in code commits that are indicative of a given programmer's style and then devise and evaluate detection mechanisms, possibly from the area of machine learning.
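A hedged sketch of the feature-extraction step: given the text of a commit's diff, compute a few simple style indicators (indentation style, line length, identifier shape) that a per-author model could be trained on. The feature set is purely illustrative; the project would use the Data61/NICTA parsing tooling and a much richer set.

```python
import re
from statistics import mean

# Extract simple, illustrative style features from the added lines of a
# unified diff. A real system would use proper parsing and many more features.

def commit_style_features(diff_text):
    added = [l[1:] for l in diff_text.splitlines()
             if l.startswith("+") and not l.startswith("+++")]
    if not added:
        return {}
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", "\n".join(added))
    return {
        "added_lines": len(added),
        "avg_line_length": mean(len(l) for l in added),
        "tab_indent_ratio": sum(l.startswith("\t") for l in added) / len(added),
        "avg_identifier_length": mean(map(len, identifiers)) if identifiers else 0.0,
        "snake_case_ratio": (sum("_" in i for i in identifiers) / len(identifiers)
                             if identifiers else 0.0),
    }

if __name__ == "__main__":
    diff = "+++ b/foo.py\n+def load_data(path):\n+\treturn open(path).read()\n"
    print(commit_style_features(diff))
```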

A deep look at the Domain Name System
The Domain Name System is one of the oldest Internet technologies in existence. Insofar as one can speak of organic growth, this is where it most certainly has taken place: the DNS specifications leave much room for interpretation and configuration options, and consequently no-one understands the DNS in its entirety. It is time to take a stab at it.

In this work, we will build on a Python framework for scanning the DNS that was previously developed at Technical University of Munich (TUM), Germany. We are going to redevelop parts of it in the Go language and evaluate performance speed-ups. We are also going to evaluate storage options, e.g. storage in graph databases or relational (SQL) ones. Finally, we are going to inspect selected parts of the DNS trees for anomalies and security weaknesses.
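As an illustration of the kind of probing involved, here is a minimal lookup sketch assuming the third-party dnspython package (the TUM framework mentioned above is not public, so this is not its API); the domains are placeholder examples.

```python
# Minimal DNS probing sketch using the dnspython package (pip install dnspython).
# This is not the TUM framework's API; it only illustrates the kind of
# records such a scanner collects. The domains below are placeholders.

import dns.resolver

def probe(domain, record_types=("A", "AAAA", "MX", "TXT")):
    results = {}
    for rtype in record_types:
        try:
            answer = dns.resolver.resolve(domain, rtype)
            results[rtype] = [r.to_text() for r in answer]
        except Exception as exc:       # NXDOMAIN, NoAnswer, timeouts, ...
            results[rtype] = f"error: {exc.__class__.__name__}"
    return results

if __name__ == "__main__":
    for domain in ["example.org", "example.com"]:
        print(domain, probe(domain))
```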

JScan - Analysis of JavaScript on the Web
JavaScript is the mainstream technology that makes the Web interactive - almost every website today uses it. The language is far from perfect, however. As a result, the code quality of many websites is subpar.

This project is going to be carried out in collaboration with Data61 (formerly NICTA). We are going to analyse the code quality on the Web and identify the most common sources of errors and, possibly, weaknesses. We are also going to look at the use of popular libraries, identify potentially dangerous calls to browser APIs, and finally look for code patterns that are typical for infected websites.

We can build on a code base that has been developed at Data61 and scale up to consider several million web sites in our work.

Mail Graph: analysing the security of email transport over the Internet
Email is transported over the Internet across a series of hops from sender to receiver. The route the email takes is recorded in the email headers. Analysis of the headers of a large email dataset can yield deep insights into the deployment of email software.

In this project, we are going to develop the graph that connects the Internet's mail servers and analyse the transport chosen between servers. For each identified link, we are going to determine the security properties and assess the overall achieved security.

The data set for this work may come from two sources. First, an existing data set of emails for a number of accounts that goes back to 2001. Second, we may be able to extract email headers from live data.

Note: ethics approval is essential to this work. If you are interested, please let the supervisor know as soon as possible so we can arrange for this.
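A small sketch of the header-parsing step using the standard library email module: extract the Received headers of a message (most recent hop first) and note which hops mention TLS. The sample message below is invented.

```python
import email
import re

# Parse the transport hops recorded in an email's Received headers.
# Received headers are prepended by each hop, so the first one listed
# is the most recent hop. The sample message below is invented.

def received_hops(raw_message):
    msg = email.message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        header = " ".join(header.split())          # collapse folded whitespace
        m = re.search(r"from\s+(\S+).*?\bby\s+(\S+)", header)
        hops.append({
            "from": m.group(1) if m else None,
            "by": m.group(2) if m else None,
            "mentions_tls": bool(re.search(r"\bE?SMTPS\b|\bTLS", header)),
        })
    return hops

if __name__ == "__main__":
    raw = ("Received: from mx1.example.net (mx1.example.net [192.0.2.1]) "
           "by mail.example.org with ESMTPS id abc123; Mon, 1 Feb 2016 10:00:00 +1100\n"
           "Received: from client.example.com by mx1.example.net with ESMTP id xyz789\n"
           "From: alice@example.com\nTo: bob@example.org\nSubject: hi\n\nbody\n")
    for hop in received_hops(raw):
        print(hop)
```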

A notary system for email
The goal of this project is to improve the security of a key Internet system, namely email, by utilising large amounts of observational and environmental data from multiple sources. Key to this undertaking is the intelligent exploitation of data obtained from network (security) measurements. The desired outcome is a demonstrable mechanism to support security decisions in as automated a fashion as possible.

Emails are transported between so-called mail transfer agents (MTAs). There is currently no working Public Key Infrastructure for these. This leaves sensitive information often entirely unprotected in transit. However, MTAs could enable better security if they had access to information that can be collected by measurement.

In this project, we are going to build a system that relies on distributed security scans from several vantage points to obtain reasonable assurance about which mappings exist between IP addresses and public keys of hosts. We will then implement this security mechanism for an email software package, e.g. MailPile.

Reproducibility of empirical results
Computer Science as a discipline is relatively young. Compared to classic sciences (such as physics), there is no perfectly developed methodology yet to document experiments in a generic way and provide data sets for the reuse of others. This is owing to a multitude of factors that distinguish computer science from other sciences and make it difficult to reproduce an experiment: for example, the need to anonymise data sets, the many subtle differences between software implementations, or the fast pace at which the Internet is evolving. However, the community has made remarkable steps forward in the past years, with more data sets becoming available and the provision of data for reuse by others becoming an important criterion in publications.

In this work, we are going to select several representative large data sets from the PREDICT collection and the Censys search engine. We then evaluate them for reproducibility, i.e. we determine to which degree it is possible to reproduce the measurements faithfully. To this end, we evaluate the datasets with some fundamental statistics and attempt to replicate the experiments.

Investigating the security of Internet routing
It is a curious but true fact: the functionality that the Internet provides is only made possible thanks to a protocol that is entirely insecure: BGP. Previous research has shown how easy attacks on this protocol really are. That it is not exploited at large scale can only reasonably be explained by the existence of even easier targets. This is going to change at some point, and it is wise to be prepared for the moment when BGP becomes of serious interest to malicious parties.

The BGP community has drafted several standards to improve the security of the protocol. One of them is the so-called Resource Public Key Infrastructure (RPKI). The goal of this work is to analyse the state this technology is in: what deployment it has achieved, how fast it is being rolled out, and how many networks it can protect effectively. To this end, we are going to develop a framework that can collect the relevant data and make it accessible to our analysis. Based on the insights we gather, we are going to compare the success of the RPKI to other efforts.

Watching the watchers in the Web PKI
The security of our communications on the WWW is shaky, at best. To a large degree, this is due to the vulnerability of particular organisations, the so-called Certification Authorities (CAs), who play a key role in the prevailing technology, the so-called X.509 Public Key Infrastructure.

Google is currently pushing a novel technology, Certificate Transparency (CT), that has the potential to uncover malicious activity against Web sites and X.509. In CT, neutral observers log the activities of Certification Authorities, which makes it possible for us to check that they do their job properly, and any misbehaviour can be detected. The crux, of course, is that the logs could misbehave themselves. Google thus provides technology to volunteers to track the behaviour of both logs and CAs - so-called Monitors.

As the technology is brand-new, ours will be among the first efforts to track X.509 use in this particular way. We are going to use (and further develop!) Google's code to set up Monitors and will use our installation to analyse the CT logs and CAs over a period of one month. We will pay particular attention to the security of high-profile and Australian web sites.
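For context, a minimal sketch of polling a CT log's signed tree head via the RFC 6962 get-sth endpoint using only the standard library; the log URL is a placeholder, and real monitoring (as in Google's monitor code) also verifies signatures and consistency proofs.

```python
import json
import urllib.request

# Poll a Certificate Transparency log's signed tree head (RFC 6962 get-sth).
# The log URL is a placeholder; a real Monitor also verifies the STH
# signature and consistency proofs between successive tree heads.

def get_signed_tree_head(log_base_url):
    url = log_base_url.rstrip("/") + "/ct/v1/get-sth"
    with urllib.request.urlopen(url, timeout=10) as response:
        sth = json.load(response)
    return {
        "tree_size": sth.get("tree_size"),
        "timestamp": sth.get("timestamp"),
        "root_hash": sth.get("sha256_root_hash"),
    }

if __name__ == "__main__":
    # Placeholder URL; substitute a real CT log endpoint.
    print(get_signed_tree_head("https://ct-log.example.net"))
```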

Projects supervised by Seok-Hee Hong

Big Data Visual Analytics
Technological advances such as sensors have increased data volumes in the last few years, and now we are experiencing a “data deluge” in which data is produced much faster than it can be used by humans.
Further, these huge and complex data sets have grown in importance due to factors such as international terrorism, the success of genomics, increasingly complex software systems, and widespread fraud on stock markets.

Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces.

This project aims to develop new visual representation, visualization and interaction methods for humans to find patterns in huge abstract data sets, especially network data.

These data sets include social networks, telephone call networks, biological networks, physical computer networks, stock buy-sell networks, and transport networks. These new visualization and interaction methods are in high demand by industry.

Projects supervised by Bryn Jeffries

The CRC for Alertness, Safety and Productivity is a collaboration between industrial and university partners to develop products that improve alertness in the workplace and during commutes. The Alertness Database project, led by Dr Bryn Jeffries in the School of Information Technologies, is developing a cloud-based database suitable for collecting and sharing the research data collected by clinical research partners, and will be the foundation for data analysis and data mining activities.

Classifying Sleep Apnoea
Sleep Apnoea is a common condition in which breathing is obstructed during sleep, leading to low-quality sleep and daytime drowsiness. This condition is thought to be the underlying cause of many road accidents. Several treatments are available, but their success is varied and so far unpredictable. In this study you would explore data from recent and ongoing studies to identify common clusters of sleep apnoea symptoms and build a classifier to identify the most effective treatment. This is a great opportunity to work within a large multidisciplinary research group and collaborate with researchers on a project that could have a significant impact on many people's lives. It would suit a student interested in data mining, and would require good communication skills. You would be expected to get acquainted with data manipulation packages such as R or Matlab, and potentially develop software of your own in e.g. Java, Python or C/C++.

In-database motif detection of time series data
Adapt a state-of-the-art motif detection algorithm to work within a PostgreSQL DBMS, handling issues such as paged memory and missing data, and investigate methods to optimise the algorithm through additional data structures. Apply the technique to real datasets, e.g. activity data from insomnia patients. This is an honours or possibly strong Masters 18CP student project. Prerequisites are INFO3404 or INFO3504 (or equivalent), and the student must be comfortable writing in C.
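As a conceptual reference point (in Python rather than the in-database C implementation the project asks for), here is a brute-force motif search: find the pair of same-length subsequences with the smallest z-normalised Euclidean distance, skipping trivially overlapping matches.

```python
import numpy as np

# Brute-force time-series motif discovery: the pair of non-overlapping
# subsequences of length m with the smallest z-normalised Euclidean distance.
# This is only a conceptual reference; the project targets an efficient
# in-database (PostgreSQL, C) implementation.

def znorm(x):
    std = x.std()
    return (x - x.mean()) / std if std > 0 else x - x.mean()

def find_motif(series, m):
    series = np.asarray(series, dtype=float)
    n = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n)]
    best = (np.inf, None, None)
    for i in range(n):
        for j in range(i + m, n):           # exclusion zone: no overlap
            d = np.linalg.norm(windows[i] - windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best                              # (distance, start_i, start_j)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ts = rng.normal(size=300)
    pattern = np.sin(np.linspace(0, 2 * np.pi, 25))
    ts[40:65] += 3 * pattern                  # plant the same motif twice
    ts[200:225] += 3 * pattern
    print(find_motif(ts, m=25))
```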

Feature extraction of speech task data
Investigate automated segmentation algorithms to isolate key regions of speech recordings; implement candidate feature extraction algorithms to characterise the quality of speech for each region; apply techniques such as Principal Component Analysis to identify the most distinctive features from extended wakefulness studies. This is an honours or Masters 18CP student project. The student may need to work with C/C++ code, so they should be comfortable with these languages.

Projects supervised by Jinman Kim

Medical Image Visualisation
With the next generation of medical imaging scanners, new diagnostic capabilities are resulting in improved patient care. These medical images are multi-dimensional (3D), multi-modality (fusion of e.g. PET and MRI) and also time-varying (e.g. 3D volumes taken over multiple time points and functional MRI). Rapid advances in GPU hardware coupled with smart image processing / rendering algorithms are enabling image visualisation techniques that can render realistic and detailed 3D volumes of the human body. Despite these developments, visualisation still relies on operator interaction to manipulate complex parameters for optimal rendering.
We have several ongoing visualisation projects at the Biomedical and Multimedia Information Technology (BMIT) research group. They involve new, innovative visualisation algorithms using emerging hardware devices (e.g. Oculus Rift and HoloLens, coupled with Kinect / Leap Motion). We seek students who can bring their own experiences and interests in visualisation to take on the new projects listed below. Students will join a team of researchers and will have the opportunity to work in a clinical environment with clinical staff and students, as well as to join the Institute of Biomedical Engineering and Technology (BMET) (Level 5 West, School of IT Building).

Ray Profile Analysis for Volume Rendering Visualisation
A key requirement for medical image volume rendering visualisation is to identify regions of interest (ROIs), such as a tumour, within a volume such that these ROIs can be prioritised in the 3D rendering. State-of-the-art approaches rely on manual or semi-automated 'image segmentation' algorithms to identify the ROIs. However, such approaches are time consuming and difficult to use, which limits their application in the clinical setting. In this project, we will introduce a new approach to automatically identifying ROIs by using the information within the 'ray' in volume rendering. We propose a 'ray profile classifier' such that a large image collection database can be used to identify patterns in ray profiles; these ROIs can then be used for improved visualisation.

MDT Visualisation
Multidisciplinary team meetings (MDTs) are the standard of care in modern clinical practice. MDTs typically comprise members from a variety of clinical disciplines involved in a patient’s care. In MDT, imaging is critical to decision-making and therefore it is important to be able to communicate the image data to other members. However, the concept of changing the image visualisations for different members, to aid in interpretation, is currently not available. In this project, we will design and develop new MDT visualisations, where we propose the use of a novel ‘optimal view selection’ algorithm to transform the image visualisation to suit the needs of the individual team members. In this approach, a set of visual rules (via qualitative and quantitative modelling) will be defined that ensures the selection of the view that best suits the needs of the different users. Our new MDT visualisation will facilitate better communication between all the clinicians involved in a patient’s care and potentially improve patient outcomes.

Exploratory Visualisation with Virtual and Augmented Reality Devices
Statistical analysis of medical images is providing new scientific and clinical insights into the data, with capabilities such as characterising traits of schizophrenia with functional MRI. Although these data, which include images and statistics, are multi-dimensional and complex, they are currently presented on traditional 2D 'flat' displays with mouse-and-keyboard input. Due to the constrained screen space and an implied concept of depth, such displays are limited in presenting a meaningful, uncluttered view of the data without compromising on preserving semantic (human anatomy) context. In this project, we will explore new emerging visualisation hardware for medical image visualisation, including virtual reality (VR) and augmented reality (AR), coupled with gesture-based inputs to create an immersive environment for visualising these data. We suggest that VR/AR can reduce visual clutter and allow users to navigate the data in a 'natural' way that lets them keep their focus on the exploratory visualisations.

Medical Avatar
With the continuing digital revolution in the healthcare industry, patients are being presented with more health data than ever before, which now includes wellness and fitness data from various sensors. Current personal health record (PHR) systems do a good job of storing and consolidating this data, but are limited in facilitating patient understanding through data visualisation. One reason for this stems from the lack of semantic context (human anatomy) that can be used to present the spatial and temporal data in the PHR. Further, PHR interfaces lack meaningful visualisation techniques. In this project, we will design and develop a data processing and consolidation framework to visualise a wide range of health data in a visual format. This will rely on the use of a patient-specific anatomical atlas of the human body, which we refer to as the 'avatar', to be constructed from the patient's health data. The framework will also include a navigable timeline of health events. This project will build upon our existing research on a web-based PHR visualisation system.

Projects supervised by Irena Koprinska

Predicting hospital admissions
Co-supervised with Michael Dinh (RPA Hospital)
The goal of this project is to predict whether a patient will be admitted to hospital based on the patient's characteristics and symptoms when presenting at the Emergency Department. The project will explore the application of machine learning techniques to create an accurate prediction rule that can be used by triage nurses to improve patient flow in emergency departments. The project will use a large dataset from several hospitals in NSW.
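To indicate the kind of pipeline involved, here is a hedged scikit-learn sketch on synthetic triage-like data; the feature names and the data-generating process are invented, and the real project would work with the NSW hospital dataset and clinically meaningful features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hedged sketch of an admission-prediction pipeline on synthetic data.
# Feature names and the generating process are invented for illustration.

rng = np.random.default_rng(0)
n = 2000
age = rng.integers(1, 95, n)
triage_category = rng.integers(1, 6, n)          # 1 = most urgent
arrived_by_ambulance = rng.integers(0, 2, n)

# Synthetic ground truth: older, more urgent, ambulance arrivals admitted more.
logit = 0.03 * age - 0.8 * triage_category + 1.2 * arrived_by_ambulance
admitted = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([age, triage_category, arrived_by_ambulance])
X_train, X_test, y_train, y_test = train_test_split(
    X, admitted, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("AUC:", round(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]), 3))
```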

Classification of breathing patterns using machine learning techniques
Co-supervised with Cindy Thamrin and Chinh Nguyen (Woolcock Institute of Medical Research), Mark Read (Charles Perkins Centre)
It is increasingly recognised that respiratory measures such as breath intervals or airway mechanics fluctuate in a non-random manner over time, from minutes, to days and months. Patterns in such fluctuations change with treatment and disease status, and even predict treatment response and future breathing attacks in asthma.

This project will investigate the diagnosis of respiratory disease through these fluctuations, which offers advantages over current approaches to diagnosing asthma and chronic obstructive pulmonary disease through spirometry, which is difficult to perform. While features can be extracted from the breathing pattern or airway mechanics over minutes or months, there have been no attempts at automated classification of such time series into health and disease.
The project will use machine learning methods to classify patterns in breathing and airway mechanics into health and disease (asthma and chronic obstructive pulmonary disease). The data has been collected from volunteers and patients at home. Potentially relevant features have been extracted, although there may be some scope for improvement. If successful, we will be able to work towards diagnosis of respiratory disease via a simple breathing test which can be performed at home. Furthermore, the determination of features which are meaningful for diagnosis will aid in our understanding of these diseases.

Revealing the development of the immune response through dynamic clustering
Co-supervised with: Uwe Roehm (SIT), Mark Read, Nick King and Thomas Ashhurst (Charles Perkins Centre)
Cutting edge technological advances in mass cytometry have enabled unprecedented detailed analyses of immune cells. This technology can measure around 40 parameters on each cell in a population of millions. However, computational analyses to make sense of the resultant high-dimensional data sets are lacking.

This project will explore how machine learning, in particular dynamic time-series cluster analysis, can bring meaning to these datasets. The techniques developed will show in fine detail how an immune response progresses, how sub-populations develop and evolve into one another, and how these sub-populations correlate with specific phases of disease.

This project will have access to a West Nile Virus (WNV) dataset to aid in development. WNV is a mosquito-borne disease that causes inflammation in the brain; it is lethal for some, and others never develop symptoms, yet the early immune response decisions underlying this disparity are unknown. The techniques developed here represent a key-enabling technology for mass cytometry, and are applicable to a wide variety of diseases beyond WNV; other data sets can be made available for this project if needed.

Projects supervised by Kevin Kuan

Emotional Contagion in Online Social Networks
The project is motivated by the recent controversial Facebook experiment on emotional contagion, in which Facebook manipulated the news feeds of nearly 700,000 users to examine whether the emotion expressed in the messages on their news feeds influenced the emotion those users expressed in their subsequent posts (Kramer et al. 2014). To probe further into this Facebook experiment, this project aims to investigate factors affecting user posting behavior in online social networks using controlled experiments and/or secondary data (e.g. from Twitter). As an option, students may choose to experiment with electroencephalography (EEG) technology to collect brainwave data.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology and basic knowledge of Matlab for EEG data analysis; basic understanding in text mining.

Factors Affecting User Perception of Online Consumer Reviews
Consumers increasingly rely on online product reviews to guide purchases. This project aims to study different review characteristics (length, tone, style, reviewer credibility, etc.) and their effects on consumer decision making using controlled experiments and/or secondary data (e.g. from Yelp.com). As an option, students may choose to experiment with electroencephalography (EEG) technology to collect brainwave data.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology and basic knowledge of Matlab for EEG data analysis; basic understanding in text mining.

Factors Affecting Consumer Behavior in Group-Buying
Group-buying sites, such as Groupon, Yahoo! Deals and LivingSocial, have emerged as popular platforms in social commerce and have received tremendous interest from both researchers and practitioners. The unique features of group-buying include the deep discounts offered on deals (typically 50% or more) and the limited availability of these deeply discounted deals (ranging from days to weeks). This project aims to study different group-buying characteristics and their effects on consumer behavior (e.g. purchase decisions) using controlled experiments and/or secondary data (e.g. from Groupon). As an option, students may choose to experiment with electroencephalography (EEG) technology to collect brainwave data.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology and basic knowledge of Matlab for EEG data analysis; basic understanding in text mining.

Factors Affecting the Effectiveness of Mobile Personalisation
Unlike web personalisation, mobile personalisation is highly time and location sensitive. It also involves more complicated trade-offs among different imperfect recommendations. This project aims to examine different mobile personalisation strategies and their effects on consumer preference towards different imperfect recommendations using controlled experiments. As an option, students may choose to experiment with electroencephalography (EEG) technology to collect brainwave data.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology and basic knowledge of Matlab for EEG data analysis; basic mobile apps programming skills.

Information Overload in Online Decision Making
The negative consequences of information overload have been studied in a range of disciplines. Davis and Ganeshan (2009) showed that humans acquire and process relatively more information under the threat of information unavailability. However, they are less satisfied with their decisions than those who acquire and process less information under no threat of information unavailability. This project extends the work of Davis and Ganeshan (2009) and investigates the impacts of information overload in the context of online decision making using controlled experiments. As an option, students may choose to experiment with electroencephalography (EEG) technology to collect brainwave data.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology and basic knowledge of Matlab for EEG data analysis.

Development of a Scale for Assessing eHealth Literacy in Australia
Co-supervised with Associate Professor Simon Poon
Description: The successful implementation of electronic health (eHealth) in Australia depends not only on the technological aspects of the solutions but also on the competencies of the users. This project aims to develop a measurement scale for assessing consumers’ perceived competence in using information technology for health and determining the fit between eHealth solutions and consumers in the Australian context.

Minimum requirements: Basic understanding in survey development and statistical analysis (e.g. ANOVA, regression, etc.).

Projects supervised by Na Liu

Information system design for aging population
Currently only 2% of the online population is 60 years or older. Despite the fast advancement of ICT technologies, many older people are still reluctant to use them. Older people's vision, dexterity, memory and hearing all deteriorate with age, which creates physical and cognitive barriers to their use of information technology. At the same time, older people often feel uncomfortable trying new things or are hesitant to explore.

In this project, you will review the literature on age differences in information processing, understand the cognitive constraints of older users, and design and build a prototype/interface of a physician appointment system for older people.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Online self-diagnosis
Many patients or people with health concerns go online to get health information for self-diagnosis. Some gather information from sources like "Yahoo Answers" and make their own judgment; some use existing online self-diagnosis tools. The major factors influencing Internet users' choice of self-diagnosis tools have not been systematically studied, and whether the self-diagnosis result leads to users' subsequent participation in healthcare is unknown.

In this project, you will seek to understand the reasons why users choose different online self-diagnosis tools, design a prototype for an online diabetes or hypertension self-diagnosis tool, and conduct a preliminary evaluation of the tool.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Design and implement a network-based social support system
Co-supervised by Dr. Na Liu, Associate Professor Simon Poon, Dr. Kevin Kuan

The Guardian Angel concept proposes that a large, decentralised social support network can more effectively motivate individuals in a connected community to collectively improve a population's health and lifestyle habits than a traditional centralised system of a few localised hubs, e.g. health care professionals (e.g. clinicians), monitoring a large number of spokes (e.g. patients). The Guardian Angel idea is to organise patients so that each individual has a guardian angel (or more than one guardian) from among the other patients (defined as children) and is assigned to be a guardian angel for someone else. The angel is able to engage with the other person's activities and then encourage and motivate the person to continue on a positive trajectory (to accomplish a health goal). Every person tries to motivate someone and is motivated by someone else, so that the health and lifestyle of the whole community improves. The question is how the dynamics of the system depend on features such as the size of the cohort, the structure of the social network, randomness of events, and variation between individuals.

This project consists of the following sub-projects:

  • Simulation Modelling: This part of the project involves using simulation methods to model various aspects of the dynamics of the social system, in particular the structure of the social networks (main supervisor Kevin Kuan)
  • Mobile application development: This part of the project would involve developing a mobile application prototype to enable the Guardian Angel Network. One aspect is to identify functionalities to encourage and motivate, and the other aspect is about user interface design (main supervisor Na Liu)
  • Mathematical Modelling: This part of the project would involve simulating possible Guardian Angel systems using differential equations or agent-based models (main supervisor Simon Poon in collaboration with School of Mathematics & Statistics).

Relevant skills: Subject to the nature of the sub-projects

Projects supervised by David Lowe

Augmented reality remotely-accessed labs
Existing remote labs largely duplicate conventional experimental labs, but the computer interface provides an opportunity to enrich the experience of interacting with the equipment by using augmented reality approaches (imagine a magnetics experiment where the video image is overlaid to show the magnetic field lines). This project involves developing the software interfaces for an existing remote laboratory in order to provide an illustrative prototype. The prototype will demonstrate the benefits that can be achieved through the use of augmented reality technologies.

Architecture for collaborative remotely-accessed labs
The leading remote labs software management system – Sahara – has been designed to be consistent with multi-student distributed collaboration, but this functionality has not yet been fully explored or implemented. This project will investigate extending Sahara to incorporate distributed student collaboration within an experiment session.

Remote control of physical systems: the role of context
It is becoming increasingly common to use remote access to control physical systems. For example, researchers within the Faculty have been exploring remote and autonomous control of Mars rovers, mining equipment, teaching laboratory apparatus and fruit-picking robots. This project will focus on the role of contextual information in supporting engagement and learning in these systems.

Projects supervised by Simon Poon

Analytical models for Causal Complexities
Understanding complex interactions amongst multiple study factors is central to a range of academic disciplines. Changing one factor may have little effect on the study outcome if other factors remain unchanged. Confounding may arise through interactions of study factors. Such interactions may extend to different configurations with complex interactions amongst many seemingly unrelated factors. Traditional Chinese Medicine (TCM) is a good example: similar to gene expression analysis, the number of herbs used is relatively large and the number of cases is relatively small, i.e. few cases relative to high dimensionality. Data mining techniques are often employed to extract patterns from clinical herbal prescription records, but this approach has its unique challenges in handling the issues of confounding. Conventional dimension reduction techniques used in data mining may not be adequate when the relationships in the data are highly interactive. Removing important herbs may lead to severe confounding in the analysis.

The aim of this project is to develop appropriate heuristics to derive suitable causal models for assessing the strength of interactions on the study outcome. The application can be used to determine how the level of effect changes under certain conditions of (or contingent on) other study factors, i.e. to facilitate a good understanding of the diverse interactions and interrelatedness among the study factors and how they impact certain diseases. The project will include the development and implementation of a data mining algorithm as well as the discovery of interaction patterns from given high-dimensional datasets.

A Framework for Health Technology Evaluation (in collaboration with Faculty of Health Sciences)
Many issues have been identified that are of potential relevance for the planning, implementation and execution of an evaluation study of health information systems. This project requires the student to systematically review the relevant literature and develop a preliminary framework to address issues covering all phases of an evaluation study, from study design, planning and implementation of the technology to evaluation of its impacts.

The project aims to develop better insights leading to higher-quality evaluation studies for health technology solutions.

Relevant skills: Health Informatics Evaluation, Health Information Systems and Health Economics

A framework to model organizational Complexities in IT business value study
In order to make effective use of IT, managers require a body of knowledge and a set of evidence-based practices that can enable them to make the correct IT resource allocation decisions relating to the appropriate mix of IT inputs, their interactions, and the complementary investments needed in organisational processes. The significance of the research lies in providing finely-grained and focused guidelines for deciding on candidate technologies and for developing effective organisational strategies. The analytical approach developed in this project can be used to provide deeper insights into the interrelatedness of multiple factors beyond the extant IT business value research, and is also potentially applicable to a wide range of disciplines for explaining complex interactions among multiple organisational factors. In this project, we will use various computational approaches to study complex organisational configurations in relation to the business value impacts of IT.

Skills Requirements: Good knowledge in Information Systems Concepts, Complex Engineering Design, Statistics

Projects supervised by Bernhard Scholz

  • Datalog Evaluation Strategies Based on Learning
  • Efficient Data-Structures for Logical Relations
  • Magic Set Transformation in Soufflé
  • Automatic Deduction of Complexity Classes in Soufflé
  • Components in Soufflé
  • Infinite Lattices in Soufflé
  • Benchmarking and Test Harness in Soufflé
  • Parallelization of Stream Programs

Projects supervised by Xiuying Wang and Professor E Chang

Expected skills and capabilities:

  • Proficiency in a programming language
  • Proficiency in database
  • Knowledge about visualisations (dashboards)
  • Knowledge about logs (such as website logs, application logs, or network logs)
  • Knowledge about general statistical methods to analyse the results.

Intelligent recommender
Design and implement a recommender system for a decision support platform, and define the system parameters and rules required for implementation. The solution is needed for a warehouse in which the senior manager makes decisions (buy/repair/transfer) based on the outcomes of decisions taken by previous managers. Also define the parameters required to recommend supplier and repairer pathways, answering the question "what is the best pathway for delivering/sending this part?", i.e. where parts should be purchased or repaired. Use case-based reasoning (CBR) so that the recommended decision is based on previous cases.

Cognitive Decision System based on Behavioural Mining
User behaviour has proven to be one of the key factors in understanding and enhancing the decision-making process. By understanding the decision maker's behaviour while making decisions, the future decision process can be improved. This project aims to design and implement a user behaviour tracking system which can determine the thought (cognitive) processes of decision makers.

Dynamic drop/creation of database views
Tracking user behaviour has proven to be a very important factor in providing a personalised and efficient user interface. The aim of this project is to track each user to find out which gadgets or data they mostly use during their session on the dashboard. This information will then be used to drop the unwanted gadgets from the screen, as well as the database views on the backend that support those gadgets. The usage information will also be used to modify or create new gadgets that the users seem to need, and the supporting database views are to be created dynamically accordingly. The initial implementation of the dashboard will be provided, along with a supplier data warehouse.
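A minimal sketch, using the standard library sqlite3 module as a stand-in DBMS, of the idea of creating or dropping backing views according to recorded gadget usage; the gadget names, view definitions and usage threshold are hypothetical.

```python
import sqlite3

# Sketch of usage-driven view management: keep a backing view only for
# gadgets whose recorded usage meets a threshold. sqlite3 stands in for
# the real DBMS; gadget/view definitions below are hypothetical.

GADGET_VIEWS = {
    "top_suppliers": "SELECT supplier, COUNT(*) AS order_count FROM orders GROUP BY supplier",
    "recent_repairs": "SELECT * FROM repairs ORDER BY repaired_at DESC LIMIT 20",
}

def sync_views(conn, usage_counts, min_uses=5):
    cur = conn.cursor()
    for gadget, select_sql in GADGET_VIEWS.items():
        view = f"view_{gadget}"
        if usage_counts.get(gadget, 0) >= min_uses:
            cur.execute(f"CREATE VIEW IF NOT EXISTS {view} AS {select_sql}")
        else:
            cur.execute(f"DROP VIEW IF EXISTS {view}")
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE orders (supplier TEXT); "
        "CREATE TABLE repairs (part TEXT, repaired_at TEXT);")
    sync_views(conn, usage_counts={"top_suppliers": 12, "recent_repairs": 1})
    views = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='view'")]
    print(views)   # ['view_top_suppliers']
```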

Interactive Decision-Making Dashboard
An organisation is planning to migrate several existing widgets from different web applications into one dynamic dashboard in order to simplify and speed up their decision making process. The widgets of the current web applications are all generated out of tables from different database management systems (DBMS). The aim of the project is to integrate all those tables into widgets on one single dashboard view.

As the amount of information (tables) available exceeds the space on a single-page dashboard, it is essential that users have the option to select which tables they want to see on their dashboard. In addition, not all information in a table (columns) is necessarily relevant for a user, so the user should be able to hide or show columns and thereby obtain more space for other widgets. To further simplify the view, an option should be given to transform numeric data into a graphical representation (diagrams) and back to a table format as needed. In addition, the user should have the option to generate several different widgets from one table.

The website has to be implemented for only one user (no user account management); however, another requirement is that a user's dashboard view has to look exactly the same as the last time they were working on it. In a future scenario, the organisation wants to learn from the behaviour of its users to propose dashboards/widgets to people in the same domain - this should be considered in the design and implementation of the dashboard.

Projects supervised by Zhiyong Wang and David Feng

Multimedia Data Summarization
Multimedia data is becoming the biggest form of big data, as technological advances have made it ever easier to produce multimedia content. For example, more than 300 hours of video are uploaded to YouTube every minute. While such a wealth of multimedia data is valuable for deriving many insights, it has become extremely time consuming, if not impossible, to watch through a large amount of video content. Multimedia data summarization aims to produce a concise yet informative version of a given piece of multimedia content, which is in high demand to assist humans in discovering new knowledge from massive, rich multimedia data. This project aims to advance this field by developing advanced video content analysis techniques and identifying new applications.

Predictive Analytics of Big Time Series Data
Big time series data are being collected to derive insights in almost every field: the clicking/viewing behaviour of users on social media sites, the electricity usage of every household, traffic flow in transportation, to name a few. Being able to predict the future state of an event is of great importance for effective planning. For example, social media sites such as YouTube can distribute popular video content to their caching servers in advance so that users can start watching the videos with minimal delay. This project will investigate existing algorithms and develop advanced analytic algorithms for higher prediction accuracy; a simple baseline of the kind such algorithms are compared against is sketched below.
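As a minimal sketch only (scikit-learn and the lag length are assumptions), an autoregressive baseline fits a linear model on lagged values of a univariate series, such as hourly view counts:

import numpy as np
from sklearn.linear_model import LinearRegression

def fit_ar_baseline(series, n_lags=24):
    """Predict series[t] from the previous n_lags observations."""
    X = np.array([series[i - n_lags:i] for i in range(n_lags, len(series))])
    y = np.array(series[n_lags:])
    return LinearRegression().fit(X, y)

# Usage: forecast the next point from the most recent n_lags observations.
# model = fit_ar_baseline(hourly_views)
# next_value = model.predict([hourly_views[-24:]])[0]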

Human Motion Analysis, Modeling, Animation, and Synthesis
People are the focus of most activities; hence research on human motion has been driven by a wide range of applications such as visual surveillance, 3D animation, novel human-computer interaction, sports, and medical diagnosis and treatment. This project will address a number of challenging issues in this area in realistic scenarios, including human tracking, motion detection, recognition, modeling, animation, and synthesis. Students will gain comprehensive knowledge in computer vision (e.g. object segmentation and tracking, and action/event detection and recognition), 3D modeling, computer graphics, and machine learning.

Projects supervised by Bing Zhou

Computing and visualising transcription factor interaction networks
Gene expression in cells is regulated by the binding and interaction of various transcription factors (TFs). With the enormous amount of ChIP-sequencing (ChIP-seq) data generated for TFs by projects such as ENCODE and modENCODE, it is possible to reconstruct TF interaction networks by correlating and modelling the binding profiles of TFs.

We will develop a high-performance computing algorithm to dynamically compute correlations among a large set of TF binding profiles. The correlation matrix will then be used to reconstruct the TF interaction networks. We also aim to develop a graphical user interface to display the TF interaction networks using methods such as clustering. The correlation-then-threshold step is sketched below.
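As a minimal sketch only, and assuming the binding profiles are rows of a NumPy array (one row per TF, columns being genomic bins) with networkx available for the graph, the network could be built from thresholded pairwise correlations:

import numpy as np
import networkx as nx

def tf_network(profiles, tf_names, threshold=0.8):
    corr = np.corrcoef(profiles)          # pairwise Pearson correlations
    g = nx.Graph()
    g.add_nodes_from(tf_names)
    n = len(tf_names)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                # Keep only strongly correlated TF pairs as network edges.
                g.add_edge(tf_names[i], tf_names[j], weight=corr[i, j])
    return g                              # clusters of g can then be visualised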

In this project students will be involved in: (1) algorithm design, implementation and testing on multicore computers and clusters of PCs; and (2) the design and implementation of an interactive graphical user interface.

Requirements: good programming skills (essential) and experience in graphical user interface development (desirable).

Projects supervised by Albert Zomaya

Addressing Interoperability in the Internet of Things
Advances in electronic devices, wireless communications and RFID technology, together with the explosive growth of the World Wide Web, have driven the development of the Internet of Things (IoT) paradigm. In the IoT vision, every object on Earth can be found and used via the Internet. Interoperability is one of the major focuses in IoT, as it provides the foundation for human-to-machine (H2M) and machine-to-machine (M2M) interaction. There are different perspectives of interoperability to be addressed in IoT systems: technical, syntactic and semantic. Technical interoperability concerns the integration, mainly through communication protocols, of the myriad of heterogeneous physical devices and systems that support M2M interactions. Syntactic interoperability focuses on the formats of the data sent in the payload of the messages exchanged by the devices; such data need to conform to well-defined formatting and encoding so that they can be consumed by all stakeholders involved in the communication. At a higher level, semantic interoperability ensures that the actors involved in the communication have a common understanding not only of the format but also of the content of the exchanged messages.

The project is aimed at improving the provision of technical and syntactic interoperability by implementing support for CoAP (Constrained Application Protocol). CoAP is a protocol developed to allow simple electronic devices (with limited computational and energy capabilities) to communicate over the Internet while taking into account four main factors: (i) battery, (ii) latency, (iii) transmission capability, and (iv) mobility. In the project we also intend to add support for semantic interoperability by adopting ontologies specifically tailored for IoT environments. A small CoAP client sketch is given below.
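As a minimal sketch only, a CoAP GET request could look like the following; the aiocoap library is one possible choice (not prescribed by the project) and the endpoint URI is hypothetical:

import asyncio
from aiocoap import Context, Message, GET

async def read_sensor():
    # Create a client context and issue a GET to a constrained device.
    protocol = await Context.create_client_context()
    request = Message(code=GET, uri="coap://fog-gateway.local/sensors/temp")
    response = await protocol.request(request).response
    print(response.code, response.payload.decode())

asyncio.run(read_sensor())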

Computation Offload in Fog Computing
Fog computing is a recently introduced extension of cloud computing that provides computing, storage and networking services between end devices and traditional cloud data centres. It aims to meet QoS requirements, such as mobility support, geographical distribution, location awareness and low latency, that are required by Internet of Things (IoT) applications but that traditional cloud computing fails to address. Offloading computation to fog devices can overcome the resource constraints of edge devices, saving storage, extending battery lifetime and, more importantly, providing better QoS to IoT applications.

This project aims to deal with the dynamics of computation offloading to fog devices. These dynamics are threefold: 1) access to fog devices is highly dynamic, 2) the availability of fog devices is highly dynamic, and 3) the resources of fog devices are highly dynamic. Given such dynamics, several questions remain to be answered: what granularity to choose for offloading at different levels of the fog/cloud hierarchy, how to dynamically partition an application for offloading to fog and cloud, and how to make offloading decisions that adapt to changes in the network, fog devices and resources. A simple per-task decision rule of the kind to be extended is sketched below.
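As a minimal sketch only, under an assumed cost model (task size in CPU cycles, input size in bytes, per-tier processing speeds and link bandwidths), a per-task offloading decision could simply compare estimated completion times:

def choose_placement(task_cycles, input_bytes, local_cps, fog_cps,
                     cloud_cps, fog_bw, cloud_bw):
    """Return the placement with the lowest estimated completion time (seconds)."""
    local = task_cycles / local_cps
    fog = input_bytes / fog_bw + task_cycles / fog_cps      # transfer + compute
    cloud = input_bytes / cloud_bw + task_cycles / cloud_cps
    return min([("local", local), ("fog", fog), ("cloud", cloud)],
               key=lambda p: p[1])

# e.g. choose_placement(5e9, 2e6, local_cps=1e9, fog_cps=4e9,
#                       cloud_cps=2e10, fog_bw=5e6, cloud_bw=1e6)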

Handling Big Data-Streams in the Internet of Things
In the Internet of Things (IoT), with billions of nodes capable of gathering data and generating information, the availability of efficient and scalable mechanisms for collecting, processing and storing data is crucial. The number of data sources, on one side, and the resulting frequency of incoming data, on the other, create a new need for Cloud architectures that can handle such massive flows of information, shifting the Big Data paradigm towards a Big Stream paradigm. Moreover, the processing and storage functions implemented by remote Cloud-based collectors are the enablers of their core business, which involves providing services based on the collected/processed data to external consumers.

Several relevant IoT scenarios (such as industrial automation, transportation, and networks of sensors and actuators) require real-time or predictable latency, and may even change their requirements (e.g., in terms of data sources) dynamically and abruptly. Big Stream-oriented systems could react effectively to such changes and allocate resources intelligently, thus implementing scalable and cost-effective Cloud services. Dynamism and real-time requirements are another reason why Big Data approaches, due to their intrinsic inertia (Big Data typically relies on batch-based processing), are not suitable for many IoT scenarios. The Big Stream paradigm allows real-time and ad hoc processing that links incoming streams of data to consumers, with a high degree of scalability, fine-grained and dynamic configuration, and management of heterogeneous data formats.

In brief, while both Big Data and Big Stream deal with massive amounts of data, the former focuses on the analysis of data, while the latter focuses on the management of flows of data. This also affects which data are considered relevant to consumer applications: while Big Data applications must keep all sensed data in order to be able to perform any required computation, Big Stream applications may aggregate or prune data in order to minimise the latency in conveying the results of computation to consumers, with no need for persistence.

The goal of this project is to investigate data-centric architectures and algorithms to efficiently manage Big Stream data flows in the Internet of Things. A minimal example of stream-style, window-bounded processing is sketched below.
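As a minimal sketch only, the snippet below illustrates the stream-style processing described above: only a bounded time window of readings is kept (pruning rather than persistence), so an aggregate is always available with low latency. The window length is an assumption.

from collections import deque
import time

class SlidingAverage:
    def __init__(self, window_seconds=10):
        self.window = window_seconds
        self.samples = deque()              # (timestamp, value) pairs

    def push(self, value, now=None):
        now = time.time() if now is None else now
        self.samples.append((now, value))
        # Prune readings that have fallen out of the window.
        while self.samples and now - self.samples[0][0] > self.window:
            self.samples.popleft()

    def average(self):
        return (sum(v for _, v in self.samples) / len(self.samples)
                if self.samples else None)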

Task Scheduling Algorithms for the Internet of Things
Applications for the Internet of Things are typically composed of several tasks or services, executed by devices and other Internet-accessible resources, including resources hosted on a cloud platform. The complete execution of an application in such environments is a distributed, collaborative process. To enable collaborative processing of applications, the following problems must be solved: (i) assigning tasks to devices (and other physical or virtual resources), (ii) determining the execution sequence of tasks, and (iii) scheduling communication between the devices and resources involved. Unlike traditional task scheduling, with the involvement of sensors and actuators the types of tasks in IoT applications go beyond computation and communication. Different allocations of these tasks to nodes consume different amounts of resources from the constrained devices and provide different quality of service (QoS) to the applications.

Although QoS management in traditional distributed and networked systems is a well-studied topic, in IoT it is still poorly investigated, and the definition of QoS in IoT is still not clear. Traditional QoS attributes such as throughput, delay or jitter are not sufficient in IoT, where additional attributes matter, for instance information accuracy (qualified by the probability that a given accuracy can be reached) and the network resources required. A task scheduling algorithm should therefore handle the efficient execution of a large number of applications on the IoT infrastructure, considering multiple types of resources and QoS parameters specific to IoT environments. Moreover, such an algorithm must consider situations in which multiple applications require common tasks (such as sensing the same physical variable), which do not need to be performed several times by the devices and physical resources, thereby saving energy. Finally, some applications may have higher priorities than others sharing the IoT infrastructure, either in terms of required response time (e.g., time-critical applications) or in terms of the amount of resources provided to them (e.g., bandwidth, sensing coverage), and these priorities must be respected when allocating tasks.

This project aims to study existing task scheduling algorithms, especially those for cloud computing and wireless sensor network environments, and to propose a new one specifically tailored for IoT environments. A simple greedy baseline of the kind such an algorithm would be compared against is sketched below.
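As a minimal sketch only (the task and device attributes are hypothetical), a greedy baseline could assign each task to a feasible device with the most residual energy:

def greedy_schedule(tasks, devices):
    """tasks: [{'id', 'cpu', 'needs_sensor'}]; devices: [{'id', 'cpu',
    'energy', 'sensors'}]. Returns a {task_id: device_id} assignment."""
    assignment = {}
    for task in sorted(tasks, key=lambda t: t["cpu"], reverse=True):
        feasible = [d for d in devices
                    if d["cpu"] >= task["cpu"]
                    and task["needs_sensor"] in d["sensors"]]
        if not feasible:
            continue                         # task cannot be placed for now
        best = max(feasible, key=lambda d: d["energy"])
        best["cpu"] -= task["cpu"]           # consume capacity on that device
        assignment[task["id"]] = best["id"]
    return assignment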

Virtualization of Wireless Sensor Networks
In the last few years, we have witnessed the emergence of a new paradigm, called the Cloud of Things (CoT). The CoT paradigm emerged from the combination of the Cloud Computing and Internet of Things (IoT) paradigms. Essentially, in the CoT paradigm the Cloud acts as an intermediate layer between smart things and applications. Such an intermediate layer hides the complexity of the smart things needed to implement applications, and allows IoT systems to benefit from the virtually unlimited resources of a Cloud to store the huge amount of data produced by the interconnected devices and to implement sophisticated processing and data analytics. Similarly, the Cloud can benefit from the IoT by extending its scope to deal with real-world objects (smart things) in a distributed and dynamic way. Among the highly heterogeneous set of smart things composing CoT environments, smart sensors and actuators play an important role. They present some specific features that need to be considered when integrating them with the Cloud, namely: (i) such devices have a lower degree of heterogeneity and mobility in comparison with some devices typical of IoT, such as wearable sensors or field operation devices; and (ii) such devices are more resource constrained, specifically in terms of the energy available for operating. The potential advantages brought by wireless sensor and actuator networks (WSAN), along with the specific features of such devices, have motivated the emergence of the Cloud of Sensors (CoS) paradigm [1], a type of ecosystem within the broader domain of CoT.

A CoS is composed of virtual nodes built on top of physical WSAN nodes, and provides support for several applications which, in turn, may require access to functionality at the Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) levels. Application owners can automatically and dynamically provision such virtual nodes on the basis of application requirements. In this sense, CoS infrastructures are built around the concept of wireless sensor network virtualization, which is expected to provide a clean decoupling between services (required by applications) and infrastructure (encompassing sensors, actuators and the cloud).

In this context, the main goal of this project is to propose a new model for wireless sensor network virtualization. In pursuing this we aim to answer two research questions: (i) how to virtualize devices (sensors, actuators and other physical objects) and networks of such devices, and (ii) how to properly separate the responsibilities between the components of a CoS (devices, applications and cloud platform). To do so, we will initially investigate existing conceptual models for the integration of physical devices with the cloud and propose (or adopt) one that fits the specific requirements of a CoS. The conceptual layers of the proposed model, their responsibilities, and the interactions among the envisioned components will then be defined. The proposed model will promote loose coupling between the layers in order to support system scalability and a clear separation of responsibilities, isolating issues related to the (sensing and communication) infrastructure from issues linked to the application. A toy illustration of the decoupling that virtualization provides is given after the reference below.

[1] Igor Leão dos Santos, Luci Pirmez, Flávia Coimbra Delicato, Samee Ullah Khan, Albert Y. Zomaya: Olympus: The Cloud of Sensors. IEEE Cloud Computing 2(2): 48-56 (2015)
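As a minimal sketch only, with hypothetical class and method names, a virtual sensor node could expose a single logical reading to applications regardless of how many physical WSAN nodes back it, illustrating the service/infrastructure decoupling described above:

class VirtualSensorNode:
    """One logical sensor exposed to applications; the physical nodes behind
    it (objects providing a read() method) can change without the application
    noticing."""

    def __init__(self, physical_sensors):
        self.physical_sensors = list(physical_sensors)

    def read(self):
        # Aggregate readings from the underlying physical nodes.
        readings = [s.read() for s in self.physical_sensors]
        return sum(readings) / len(readings) if readings else None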

The following projects are in collaboration with Dr Mahendra Piraveenan (Civil Engineering).

Corporate networks and cascading failures: how shareholding patterns are influential in potential financial meltdowns
The modern corporate world can be modelled as a complex network, in which many organizations are major shareholders in other organizations. Therefore, the liquidation of one organization often has a cascading effect, and may lead to a national or global crisis. For example, the collapse of Lehman Brothers led to such a cascading effect in 2008 and was considered one of the triggers of the global financial crisis that year.

By modelling companies as nodes and shareholding interests as links, it is possible to predict which organizations are most important for global financial stability. These may not necessarily be the largest or richest organizations, but those which hold strategic places in the corporate shareholding network. Furthermore, by applying the well-known 'cascading failure' model, commonly used, for example, in power systems design, it will be possible to envisage a set of triggers or scenarios which could again cause a global financial crisis. This project will aim to do this by first putting together a corporate network based on shareholding patterns and then analysing it. The project will be of much applied interest to corporations as well as to economists and government policy planners. The core of the cascade model is sketched below.
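As a minimal sketch only, and assuming networkx with a hypothetical edge attribute 'stake' (the fraction of a shareholder's value tied up in the company it holds), a cascade could be simulated as follows:

import networkx as nx

def cascade(g: nx.DiGraph, seed, loss_threshold=0.3):
    """Liquidate `seed`; a shareholder fails once its total stake in failed
    companies exceeds loss_threshold. Edges point shareholder -> company held.
    Returns the set of failed companies."""
    failed, frontier = {seed}, [seed]
    while frontier:
        company = frontier.pop()
        # Shareholders of the newly failed company absorb the loss of their stake.
        for shareholder in g.predecessors(company):
            if shareholder in failed:
                continue
            exposure = sum(g[shareholder][c]["stake"]
                           for c in g.successors(shareholder) if c in failed)
            if exposure > loss_threshold:
                failed.add(shareholder)
                frontier.append(shareholder)
    return failed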

The student undertaking this project is expected to have good fluency in programming (preferably in Java, but this is not a must), and will learn to use graph theory and complex systems theory to analyse and make predictions about real-world financial systems. The student will also develop data gathering skills, and will often need to write scripts and macros for gathering data from potential sources. There will be several opportunities for interdisciplinary collaboration in this project, and the project has great potential to become a sustained line of research should the student opt for a research career; at the same time, the project will be of great interest to industry, given its implications for the economy.

Placement matters in making good decisions: understanding how you can improve your cognitive decision making abilities by re-positioning yourself in your network
Have you seen the movie A Beautiful Mind, starring Russell Crowe? It was about the scientist John Nash who, despite suffering from schizophrenia, had a beautiful mind and made major contributions to game theory, for which he was awarded the Nobel Prize in 1994. The field of game theory has since made rapid advances and contributed to economics, computer science, social science and other fields. Game theory is used to model cognitive decision-making scenarios, where you compete with other people for resources or profit, and you need to make optimal decisions without knowing what decisions the other 'players' will make. For example, if you are in the property market, the offers you make for a property will depend on what you think other house-buyers are likely to offer.

In the real world, though, not every member of a social system is involved in the same number of 'games': the number of 'games' each player plays depends on the underlying contacts each player maintains, or the opportunities each player has, in the system. That is, games are played along the links of the underlying social network, which is modelled by networked game theory. In a social scenario, links are associated with costs and time, so the number of links each person can maintain is limited. Certain people are better positioned within the network to gain maximum profit than others, depending on their topological placement. This work will analyse how a person can maximise their profit by using their social links wisely and placing themselves strategically within their network. Even though it will initially be an abstract study, the concepts developed will be of much relevance to several industries, including the corporate world and the real estate market, and even to preventing the spread of viruses and other contagions in computer networks. A minimal illustration of a networked game is sketched below.
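As a minimal sketch only, in a networked game each node plays a two-player game with every neighbour (the payoff matrix below is an assumed prisoner's-dilemma example) and its total payoff depends on where it sits in the network; networkx is assumed:

import networkx as nx

PAYOFF = {("C", "C"): 3, ("C", "D"): 0,     # row player's payoff
          ("D", "C"): 5, ("D", "D"): 1}

def total_payoffs(g: nx.Graph, strategy):
    """strategy maps each node to 'C' (cooperate) or 'D' (defect).
    Returns each node's payoff summed over all of its neighbours."""
    return {n: sum(PAYOFF[(strategy[n], strategy[m])] for m in g.neighbors(n))
            for n in g.nodes}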

The student undertaking this project is expected to have strong programming skills (preferably in Java, but this is not a must), decent analytical skills, and fluency in basic mathematics. There will be several interdisciplinary collaboration opportunities, both within the University of Sydney and internationally.

Is society preventing you from getting married? Cracking the matching problem using network theory
Have you ever been left without a partner because all the people you know have been 'taken' by others? This can happen in many situations in life, for example when you are trying to find a team for a group assignment or lab work, when you are attempting to share your flat with a friend, or indeed, when you are trying to find a life partner. In all these scenarios, people who have many contacts obviously find it easier to get hold of a partner, whereas people who have few contacts struggle to find one. If everyone in the community had the same number of contacts then everybody would have an equal probability of success, but this is usually not the case.

Contact patterns between people or agents can be modelled as networks. A network in which a few people have a very high number of contacts and many people have far fewer is called a 'scale-free' network, and most social networks fall into this category. The contact patterns of a network can be quantified mathematically by the network's 'degree distribution', and the level of scale-freeness can be measured from this degree distribution. For a given social network, the maximum number of matchings it can have is limited by its topology: if several leaf nodes are attached to a few hubs, each time a hub is 'taken' (matched with another node), all the other leaf nodes attached to it lose the chance to find a matching. Determining the maximum number of matchings a network can have is called the 'matching problem' in graph theory and mathematics.

In this project, you will endeavour to understand the theoretical relationship between the 'scale-freeness' of networks (communities) and the maximum number of matchings possible in each community. In other words, you will demonstrate how the network structure of the community is responsible for some nodes being 'left out' without partners. This will be done mainly through simulation of matchings in networks using computer programs; a minimal version of such a simulation is sketched below. Even though the project is theoretical at the outset, it has obvious practical relevance and is likely to be a long-term research avenue for students who are interested in further research.
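As a minimal sketch only, and assuming networkx is available, one such simulation generates a scale-free-like network with the Barabási-Albert model, computes a maximum matching, and measures the fraction of nodes left unmatched:

import networkx as nx

def unmatched_fraction(n=1000, m=1):
    g = nx.barabasi_albert_graph(n, m)                   # scale-free-like graph
    matching = nx.max_weight_matching(g, maxcardinality=True)
    matched = {v for edge in matching for v in edge}     # nodes covered by it
    return 1 - len(matched) / n                          # the 'left out' share

# Sweeping m (and hence the degree distribution) shows how the topology of a
# community limits how many of its members can be matched.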

The student undertaking this project is expected to have strong programming and analytical skills, and a sound basic knowledge of mathematics and statistics.