Grace Kobusinge1,2, 1Gothenburg University, Gothenburg, Sweden and 2Makerere University, P.O. Box 7062 Kampala, Uganda
Owing to their information processing and dissemination power, health information systems (HIS) can readily make past patient medical information available across the continuum of care to facilitate ongoing treatment. However, a number of existing HIS are designed as vertical silos with no interoperability obligations and therefore cannot exchange patient information. At the same time, little is known about the intricacies and factors that surround HIS interoperability implementations. This study therefore employs an institutional lens to investigate contextual factors that affect the design of HIS interoperability. Through this perspective, the following seven contextual factors were identified: institutional autonomism, intended system goals, existing health information systems, national HIS implementation guidelines, interoperability standards, policy, and resources. A further implication of the study is that an institutional lens can be used to make sense of a system’s integration context in order to discover salient factors that might affect the design of HIS interoperability.
Health Information Systems’ Interoperability, Contextual Factors, Health Information Systems’ Designing.
Nwanneka Eya1 and Rufai Ahmad2, 1Department of Computer and Information Science, University of Strathclyde, Glasgow, Scotland and 2University of Strathclyde, G1 1XH, Glasgow, Scotland, United Kingdom
Enterprise information systems play an important role in manufacturing companies by integrating the firm’s information, operating procedures and functions across all departments, resulting in better operation in the global business environment. In developing countries like Nigeria, most manufacturing firms face the need to compete efficiently in global markets. This is because of the dynamic Nigerian business environment, which continues to evolve, and the enormous government support for indigenous manufacturers. The need for an enterprise information system therefore cannot be overemphasized. However, because such a system is a major investment that is expensive and time consuming, it is important to assess whether a company is ready for such a major transition. In assessing the readiness of Nigerian manufacturing companies for ERP implementation, there are many factors to consider. This study assesses the readiness level of Nigerian manufacturing companies based on survey responses from a wide spectrum of Nigerian manufacturing firms. The findings showed that readiness level is mainly influenced by technological, organizational and environmental factors, which involved assumed benefits, assumed difficulty, technological architecture, technological skills, competitive pressure, organization size and information management priority. It was observed that technological factors had more impact in determining the readiness level of a firm. This paper suggests a structure or standard that Nigerian manufacturers could use to ascertain their company’s readiness level before investing in an enterprise information system.
Enterprise information system, Readiness analysis, Nigeria, Manufacturing, Company.
Xiang Ao and Shuaicheng Li, Department of Computer Science, City University of Hong Kong, Hong Kong
Bladder cancer (BC) is one of the most prevalent diseases worldwide and has attracted numerous studies. High-throughput sequencing makes it convenient to explore genetic changes, such as variation in gene expression, during the development of BC. In this study, we performed differential gene expression (DGE), differential transcript expression (DTE), and differential transcript usage (DTU) analyses on an RNA-seq dataset of 42 bladder cancer patients. DGE analysis reported 8543 significantly differentially expressed (DE) genes. In contrast, DTE analysis detected 14350 significantly DE transcripts from 8371 genes, and DTU analysis detected 27914 significantly differentially used (DU) transcripts from 8072 genes. Analysis of the top 5 DE genes demonstrated that DTE and DTU analyses reveal the source of changes in gene expression at the transcript level. The transcript-level analysis also identified DE and DU transcripts from previously reported mutated genes related to BC, such as ERBB2, ESPL1, and STAG2, suggesting an intrinsic connection between gene mutation and alternative splicing. Hence, transcript-level analysis may help disclose the underlying pathological mechanism of BC and further guide the design of personalized treatment.
Bladder Cancer, Differential Gene Expression, Differential Transcript Expression, Differential Transcript Usage
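The difference between gene-level expression and transcript usage in the abstract above can be illustrated with a small sketch. It only computes the underlying quantities (a gene total and per-transcript proportions) and performs no statistical testing. The function names and the two-transcript gene are hypothetical and are not taken from the study.

```python
def gene_expression(transcripts):
    """Gene-level expression: the sum over the gene's transcripts."""
    return sum(transcripts.values())


def transcript_usage(transcripts):
    """Usage of each transcript: its share of the gene's total expression."""
    total = gene_expression(transcripts)
    if total == 0:
        return {name: 0.0 for name in transcripts}
    return {name: value / total for name, value in transcripts.items()}


def usage_shift(before, after):
    """Change in usage per transcript between two conditions."""
    u_before = transcript_usage(before)
    u_after = transcript_usage(after)
    return {name: u_after[name] - u_before[name] for name in before}


# Hypothetical gene with two transcripts (values are made up for illustration).
control = {"tx1": 80.0, "tx2": 20.0}
tumour = {"tx1": 20.0, "tx2": 80.0}

# The gene total is unchanged, so a gene-level comparison sees nothing,
# while the usage of each transcript shifts strongly.
same_gene_total = gene_expression(control) == gene_expression(tumour)
shift = usage_shift(control, tumour)
```

In this made-up case the gene would not appear as differentially expressed, yet both transcripts change their share of the total, which is the kind of change a transcript-level analysis is meant to surface.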
Shaodi Li1, Junmin Wu2, Yi Zhang3 and Yawei Zhou4, 1School of Software Engineering, University of Science and Technology of China, Suzhou, China and 2,3,4School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Communication between parallel programs is an indispensable part of parallel computing. SW26010 is a heterogeneous many-core processor used to build the Sunway TaihuLight supercomputer and is well suited for parallel computing. Our team is designing and implementing a coroutine scheduling system on the SW26010 processor to improve its concurrency, and communication between coroutines is a necessary prerequisite for that system. Therefore, this paper proposes a communication system for data and information exchange between coroutines on the SW26010 processor, which contains the following parts. First, we design and implement a producer-consumer channel based on a ring buffer, and design a synchronization mechanism for the multi-producer, multi-consumer case based on the different atomic operations available on the MPE (management processing element) and CPE (computing processing element) of SW26010. Next, we design a wake-up mechanism between the producer and the consumer, which reduces the time the program spends waiting for communication. Finally, we test and analyse the performance of the channel with different numbers of producers and consumers, and conclude that channel performance decreases as the number of producers and consumers increases.
Coroutine, SW26010, Many-core, Parallel Communication, Synchronization
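A ring-buffer channel with multiple producers and consumers, as outlined in the abstract above, might look roughly like the sketch below. It is written with ordinary Python threads, and a lock with two condition variables stands in for the MPE/CPE atomic operations and the wake-up mechanism. The class name `RingChannel` and the helper `run_demo` are hypothetical and do not reflect the authors' SW26010 implementation.

```python
import threading


class RingChannel:
    """Bounded FIFO channel backed by a fixed-size ring buffer."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._buf = [None] * capacity
        self._capacity = capacity
        self._head = 0   # index of the next item to read
        self._count = 0  # number of items currently stored
        lock = threading.Lock()
        # Both conditions share one lock: producers wake consumers and
        # consumers wake producers, so nobody spins while waiting.
        self._not_empty = threading.Condition(lock)
        self._not_full = threading.Condition(lock)

    def send(self, item):
        with self._not_full:
            while self._count == self._capacity:
                self._not_full.wait()
            tail = (self._head + self._count) % self._capacity
            self._buf[tail] = item
            self._count += 1
            self._not_empty.notify()

    def recv(self):
        with self._not_empty:
            while self._count == 0:
                self._not_empty.wait()
            item = self._buf[self._head]
            self._buf[self._head] = None
            self._head = (self._head + 1) % self._capacity
            self._count -= 1
            self._not_full.notify()
            return item


def run_demo(num_producers=2, num_consumers=2, items_each=50, capacity=4):
    """Send items_each items per producer and return everything received."""
    chan = RingChannel(capacity)
    done = object()  # sentinel telling a consumer to stop
    received = []
    received_lock = threading.Lock()

    def producer(pid):
        for i in range(items_each):
            chan.send((pid, i))

    def consumer():
        while True:
            item = chan.recv()
            if item is done:
                return
            with received_lock:
                received.append(item)

    producers = [threading.Thread(target=producer, args=(p,))
                 for p in range(num_producers)]
    consumers = [threading.Thread(target=consumer)
                 for _ in range(num_consumers)]
    for t in producers + consumers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        chan.send(done)  # one stop signal per consumer
    for t in consumers:
        t.join()
    return sorted(received)
```

A single shared lock keeps the sketch simple, but it also serializes every `send` and `recv`, which is one plausible reason a channel like this slows down as producers and consumers are added.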
Sukhwan Jung, Rachana Reddy Kandadi, Rituparna Datta, Ryan Benton, and Aviv Segev, Department of Computer Science, University of South Alabama, Mobile, Alabama
Technological developments are not isolated; they are influenced not only by similar technologies but also by many other entities, some of which are unforeseen by experts in the field. The authors propose a method for identifying technology-relevant entities with trend curve analysis. The method first utilizes the tangential connections between terms in an encyclopedic dataset to extract technology-related entities with varying relation distances. Changes in their term frequencies within 389 million academic articles and 60 billion web pages are then analyzed to identify technology-relevant entities, incorporating the degrees of and changes in both academic interest and public recognition. The analysis is performed to find entities both significant and relevant to the technology of interest, resulting in the discovery of 40 and 39 technology-relevant entities for unmanned aerial vehicles and hyperspectral imaging, with accuracies of 0.875 and 0.5385, respectively. The case study showed that the proposed method can capture hidden relationships between semantically distant entities.
Technology Forecasting, Trend Curve, Big Data, Academic Articles, Web Pages
Krikor Maroukian1 and Stephen R. Gulliver2, 1Microsoft, Kifissias Ave., Athens, Greece and 2University of Reading, Henley Business School, Business Informatics Systems and Accounting, Reading, UK
This contribution focuses on research undertaken in highly structured software-intensive organizations and on the transitional challenges associated with journeys towards adopting agile, lean and DevOps practices and principles. To gain insight into the research questions, data were collected through a series of interviews with thirty (30) practitioners from the EMEA region (Czech Republic, Estonia, Italy, Georgia, Greece, The Netherlands, Saudi Arabia, South Africa, UAE, UK), working in nine (9) different industry domains and ten (10) different countries. A set of agile, lean and DevOps practices and principles that organizations choose to include in their adoption journeys towards DevOps-oriented structures is identified. The most frequently adopted structured service management practices that can contribute to the success of DevOps adoption are also identified. Results also indicate that software development and operations roles in DevOps-oriented organizations can benefit from existing highly structured service management approaches.
Agile, Lean, DevOps, IT Service management, Software Development Practices and Principles
Ngan-Khanh Chau1 and Truong-Thanh MA2, 1An Giang University, VNU-HCM, Vietnam and 2Soc Trang Community College, Vietnam
Preserving and promoting intangible cultural heritage is a problem of essential interest. The cultural heritage of the world has been accumulated and respected throughout the development of human society. With the aim of preserving traditional dances, this paper is a significant step in our ongoing research to build an intelligent storage repository that helps manage large-scale heterogeneous digital content efficiently, particularly in the dance domain. We concentrate on classifying the fundamental movements of Vietnamese Traditional Dances (VTDs), which is the foundation for automatically detecting the motions of the dancer’s body parts. We also propose a framework that classifies the basic movements by coupling a sequential aggregation of Deep-CNN architectures (to extract the features) with a Support Vector Machine (to classify the movements). In this study, we automatically detect and extract the primary movements of VTDs, and then store the extracted concepts in an ontology base that supports reasoning, query answering, and searching of dance videos.
Vietnamese Traditional Dance, Deep learning, Support vector machine.
Merve Duman, Roya Choupani and Faris Serdar Tasel, Department of Computer Engineering, Cankaya University, Ankara, Turkey
In this study, an automatic identification method for antibiogram analysis is implemented, existing methods are investigated, and results are discussed. In an antibiogram analysis, the inhibition zones of drugs may be measured incorrectly when read by humans. Mistakes such as misreading during the analysis, and conditions such as imperfect or partial seeding of inhibition zones, can be addressed with automatic identification methods. In addition, because inhibition zones change over time, periodic reading or a tracking system is needed. To overcome these problems, the image is first improved. As pre-processing operations, Otsu thresholding, largest-object finding, binary image masking, and morphological erosion and closing are applied. The Circular Hough Transform is used to find the drugs, and profile lines are drawn to find the inhibition zones. Otsu thresholding is used to determine the zone borders. The results obtained from the algorithm are evaluated and discussed.
Antibiogram Analysis, Image Processing, Feature Extraction, Object Detection, Image Segmentation
Firas Gerges and Frank Y. Shih, Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
Acne and Rosacea are two common skin diseases that affect many people in the United States. The two conditions can produce similar symptoms, which leads patients to mistake one for the other. In this paper, we aim to develop a model that can efficiently differentiate between these two skin conditions. A deep learning model is proposed to automatically distinguish Rosacea from Acne using images of infected skin. Image augmentation is used to enlarge the original dataset. Experimental results show that our model achieves very high performance, with an average testing accuracy of 99%.
Acne, Rosacea, Deep Learning, Image Processing, Convolutional Neural Networks
Zhi Yang1 and Youhua Yu2, 1School of Humanities, Jinggangshan University, Ji’an, P.R.China and 2Matrix Technology(Beijing Ltd.), Daxing, Beijing, P.R.China
Fusing data from optical cameras with data from other physical sensors to form a uniform surface has long been a fascinating idea in the academic world, especially in the computer vision community. Owing to practical limits, research on the subject has endured a long and arduous journey, and its prospects would be much more obscure were it not for breakthroughs in shape recovery. Benefiting from this advancement, in this paper we propose an updated version of the Luminance Integration (LI) method. The key achievement of our work is the introduction of Spectrophotometric Analysis (SA), which handles luminance/intensity values in a fully justified reflectance-spectroscopic fashion, to resolve the confusion introduced by colorimetric models and photoelectric equipment. In addition, a framework of statistical spatial alignment is used for data fusion, in which geometrical and semantic inferences are given. In particular, the alignment process generally starts with a series of spatial transformations, based on the assumption that the transformations are able to diminish unwanted variance components. Moreover, these voxel-based analyses all derive from the same rigid body of an object, so the Magnitude of Relative Error (MRE) caused by regionally specific effects of the frame of reference can be reduced to a minimum. Finally, in an extensive series of experiments, we carefully evaluate the parametric models we have constructed, with the aim of refining the shape alignment with proper deterministic and semantic inferences. Our results show that our method effectively improves the accuracy of shape recovery and the “on-line performance” of matching corresponding spatial data, especially data from optical cameras and depth/distance sensors.
Spectrophotometric Analysis, Data Fusion, Shape Recovery, Point Cloud, Surface Alignment
Ankit Kamboj and Nilesh Powar, Advanced Analytics Team, Cummins Technologies India Pvt. Ltd, Pune, India
Safety is of paramount importance for employees working in industrial and construction environments. Real-time object detection is an important technique for detecting violations of safety compliance in an industrial setup. Neglecting to wear a safety helmet can be hazardous to workers. An automatic surveillance system that detects persons not wearing helmets is therefore of utmost importance, and it would reduce the labor-intensive work of monitoring violations. In this paper, we deploy an advanced Convolutional Neural Network (CNN) algorithm called the Single Shot Multibox Detector (SSD) to monitor safety helmet violations. Various image processing techniques are applied to all the video data collected from the industrial plant. A practical and novel safety detection framework is proposed in which the CNN first detects persons in the video data and, in a second step, detects whether each person is wearing a safety helmet. Using the proposed model, deep learning inference benchmarking is done on a Dell Advanced Tower workstation. A comparative study of the proposed approach in terms of detection accuracy (average precision) illustrates the effectiveness of the proposed framework.
Safety Helmet Detection, Deep Learning, SSD, CNN, Image Processing
Hritam Basak and Sreejeet Maity, Electrical Engineering Department, Jadavpur University, Kolkata, India
In this paper we propose an automated method for Cobb angle computation from radiograph (X-ray) images of scoliosis patients, with the objective of increasing the reliability of spinal curvature quantification. The automatic technique comprises four steps: pre-processing (denoising and filtering), region of interest (ROI) identification, feature extraction, and Cobb angle computation from the extracted spine centre-line. A support vector machine (SVM) classifier is used for object identification and feature extraction. The spine is assumed to be a continuous structure rather than a series of discrete vertebral bodies with individual orientations. Several methods are used to identify the centre-line of the spine: morphological operations, Gaussian blurring and polynomial fitting. Tangents are then taken at every point of the extracted centre-line, and the Cobb angle is evaluated from this set of tangents. The approach was evaluated on 25 coronal X-ray images. ROI identification based on the SVM classifier is effective, with a specificity of 100%, and approximately 58% of the centre-line extractions from this ROI were accurate where the angular variability is small or negligible. Because of low radiation dose and several other factors, the endplates and edges of the spine in radiograph images are blurred, and hence the continuous contour-based approach gives better reliability.
Scoliosis, Cobb Angle, CADx
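The last step in the abstract above, evaluating the Cobb angle from tangents along a polynomial centre-line, can be sketched as follows. The sketch assumes the centre-line has already been fitted as a polynomial x(y), with y running along the spine, and takes the angle as the largest difference between tangent inclinations. The function names and this exact definition are assumptions for illustration, not the authors' code.

```python
import math


def poly_derivative(coeffs):
    """Derivative of x(y) = c0 + c1*y + c2*y**2 + ... (lowest order first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, y):
    """Evaluate a polynomial with Horner's rule (lowest order first)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * y + c
    return result


def cobb_angle_deg(coeffs, y_start, y_end, samples=200):
    """Largest difference between tangent inclinations along the centre-line."""
    slope = poly_derivative(coeffs)
    angles = []
    for i in range(samples + 1):
        y = y_start + (y_end - y_start) * i / samples
        # Inclination of the tangent relative to the vertical axis.
        angles.append(math.degrees(math.atan(poly_eval(slope, y))))
    return max(angles) - min(angles)
```

A straight centre-line yields an angle of zero, while the parabola x = 0.5*y**2 sampled on [-1, 1] has tangents ranging from -45 to +45 degrees, so its angle is 90 degrees.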
Tao Yan1, In-Ho Ra2, Shaojie Hou1, Jingyu Feng1, and Zhicheng Wu2, 1School of Information Engineering, Putian University, Putian, China and 2School of Computer, Information and Communication Engineering, Kunsan National University, Gunsan, South Korea
The joint video expert group proposed the JMVM reference model for multi-view video coding, but the model does not provide an effective rate control scheme. This paper proposes a rate control algorithm for multi-view video coding (MVC) based on correlation analysis. The core of the algorithm is to first divide all images into six types of coded frames according to the structural relationship between disparity prediction and motion prediction, improve the binomial rate distortion model, and then analyse the correlation between different views based on similarity analysis. Bit rate control for multi-view video coding is organized as a four-layer structure. Within it, frame-layer rate control takes the layered B frames and other factors into account when allocating the bit rate, and basic-unit-layer rate control uses different quantization parameters according to the content complexity of the macroblock. The average error between the actual bit rate and the target bit rate under the proposed algorithm can be kept within 0.94%.
Multi-view video coding, Rate control, Bit allocation, Similarity analysis, Basic unit layer
Ammar K Alazzawi1*, Helmi Md Rais1, Shuib Basri1, Yazan A. Alsariera4, Abdullateef Oluwagbemiga Balogun1,2, Abdullahi Abubakar Imam1,3, 1Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Bandar Seri Iskandar 32610, Perak, Malaysia, 2Department of Computer Science, University of Ilorin, PMB 1515, Ilorin, Nigeria, 3Department of Computer Science, Ahmadu Bello University, Zaria, Nigeria and 4Department of Computer Science, Northern Border University, Arar 73222, Saudi Arabia
t-way interaction testing is a systematic approach for exhaustive test set generation. It is a vital test planning method in software testing that generates test sets based on interactions among parameters so as to cover every possible combination. The t-way strategy specifies the interaction strength between the parameters. However, some test set combinations should be excluded when generating the final test set because they lead to invalid outputs or are impossible or unwanted (e.g. those set by system requirements). These combinations are known as constraint combinations or forbidden combinations. Several t-way strategies have been proposed in existing studies to address the test set combination problem. However, generating the optimal test set remains an open research problem because it is NP-hard. Therefore, this study proposes a novel hybrid artificial bee colony (HABC) t-way test set generation strategy with constraint support. The proposed approach hybridizes the artificial bee colony (ABC) algorithm with the particle swarm optimization (PSO) algorithm: PSO is integrated as the exploratory agent for ABC, hence the hybrid nature. The information-sharing ability of PSO via the Weight Factor is used to enhance the performance of ABC. The output of the hybrid ABC is a set of promising optimal test set combinations. The experimental results showed that HABC outperformed existing methods (HSS, LAHC, SA_SAT, PICT, TestCover, mATEG_SAT) and yielded better test sets.
Software testing, t-way testing, hybrid artificial bee colony, meta-heuristics, optimization problem.
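The coverage goal that any t-way strategy has to satisfy, including the handling of forbidden combinations described in the abstract above, can be made concrete with a short checker. This is not the HABC search itself. It only enumerates the required t-way interactions and reports which ones a candidate test set leaves uncovered. The function names and the constraint callback are hypothetical.

```python
from itertools import combinations, product


def required_interactions(domains, t, is_forbidden=lambda partial: False):
    """All t-way value combinations that a test set must cover.

    domains: one list of values per parameter.
    is_forbidden: callback taking {param_index: value}; returns True for
    combinations that constraints rule out (hypothetical interface).
    """
    required = set()
    for params in combinations(range(len(domains)), t):
        for values in product(*(domains[p] for p in params)):
            interaction = tuple(zip(params, values))
            if not is_forbidden(dict(interaction)):
                required.add(interaction)
    return required


def uncovered(tests, domains, t, is_forbidden=lambda partial: False):
    """Required interactions that no test in the set exercises."""
    remaining = required_interactions(domains, t, is_forbidden)
    for test in tests:
        for params in combinations(range(len(test)), t):
            remaining.discard(tuple((p, test[p]) for p in params))
    return remaining


# Three binary parameters, pairwise (t = 2) coverage.
domains = [[0, 1], [0, 1], [0, 1]]
tests = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
missing = uncovered(tests, domains, 2)
```

Here four tests cover all twelve pairwise interactions that exhaustive testing would need eight tests for. A search strategy such as the one in the abstract would try to find such small covering sets while skipping forbidden combinations.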
Tao Yan1, In-Ho Ra2, Jingyu Feng1, Linyun Huang1 and Zhicheng Wu1, 1School of Information Engineering, Putian University, Putian, China and 2School of Computer, Information and Communication Engineering, Kunsan National University, Gunsan, South Korea
The difficulty of rate control for multi-view video coding (MVC) lies in how to allocate bits between views. In our previous research, bit allocation among viewpoints used correlation analysis among viewpoints to predict the weight of each viewpoint. When the scene changes, however, this prediction method produces large errors. This article therefore uses scene detection to avoid that situation. The core of the algorithm is to first divide all images into 6 types of encoded frames according to the structural relationship between disparity prediction and motion prediction, improve the binomial rate distortion model, and then perform inter-view, frame-layer, and basic-unit-layer bit allocation and rate control based on the encoded information. A reasonable bit rate is allocated between viewpoints based on the encoded information, and the frame-layer bit rate is allocated using frame complexity and time-domain activity. Experimental simulation results show that, compared with the current JVT MVC with fixed quantization parameters, the algorithm can effectively control the bit rate of MVC while maintaining high coding efficiency.
Multi-view video coding, Rate control, Bit allocation, Rate distortion model, Basic unit layer
Nitin Khosla1 and Dharmendra Sharma2, 1Assistant Director – Performance Engineering, Dept. of Home Affairs, Australia and 2Professor – Computer Science, University of Canberra, Australia
This paper uses a semi-supervised neural network learning approach to apply and improve supervised classifiers and to develop a model that predicts CPU usage under unpredictable peak load (stress conditions) in a large enterprise application environment that hosts several hundred applications and serves a large number of concurrent users. The method forecasts the likelihood of extreme CPU use caused by a burst in web traffic, mainly from a large number of concurrent users. In a large enterprise IT system, a large number of applications run simultaneously in real time. The model extracts features by analysing the workload patterns of user demand, which are mainly hidden in the data related to key transactions of core IT applications. The method creates synthetic workload profiles by simulating concurrent users, executes the key scenarios in a test environment, and uses our model to predict excessive CPU utilization under peak load (stress) conditions. We use the Expectation Maximization method with different dimensionality and regularization, attempting to extract and analyse the parameters that improve the likelihood of the model after marginalizing out the unknown labels. Workload demand prediction with semi-supervised learning has tremendous potential in capacity planning to optimize and manage IT infrastructure at lower risk.
Semi-supervised learning, Performance Engineering, Stress testing, Neural Nets, Machine learning applications
Tao Yan1, In-Ho Ra2, Zhicheng Wu1, Shaojie Hou1 and Jingyu Feng1, 1School of Information Engineering, Putian University, Putian, China and 2School of Computer, Information and Communication Engineering, Kunsan National University, Gunsan, South Korea
The current multi-view video coding (MVC) reference model of the joint video team (JVT) does not provide an efficient rate control scheme, and MVC bit rate control has not yet been thoroughly studied. Based on an analysis of the shortcomings of the rate distortion models used in existing video bit rate control and of the characteristics of multi-view video coding, this paper proposes a multi-view video coding rate control algorithm based on the quadratic rate distortion (RD) model. The core of the algorithm is to first divide all images into 6 types of encoded frames according to the structural relationship between disparity prediction and motion prediction, improve the binomial rate distortion model, and then perform inter-view, frame-layer, and basic-unit-layer bit allocation and rate control based on the encoded information. A reasonable bit rate is allocated between viewpoints based on the encoded information, and the frame-layer bit rate is allocated using frame complexity and time-domain activity. Experimental simulation results show that, compared with the current JVT MVC with fixed quantization parameters, the algorithm can effectively control the bit rate of multi-view video coding while maintaining high coding efficiency.
MVC(multi-view video coding), Rate control, Bit allocation, Human visual characteristics
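For readers unfamiliar with quadratic rate distortion models, the sketch below shows the classical form used in single-view rate control, R = x1*MAD/Q + x2*MAD/Q^2, and how a quantization step is obtained from a bit budget. The abstract above does not give the authors' improved model, so this classical form, the parameter names `x1`, `x2` and `mad`, and the function names are all assumptions for illustration.

```python
import math


def predicted_bits(q, mad, x1, x2):
    """Classical quadratic R-D model: bits as a function of quantization step q.

    mad is the predicted mean absolute difference of the residual, and
    x1 and x2 are model parameters that an encoder would update as it codes.
    """
    return x1 * mad / q + x2 * mad / (q * q)


def quantization_step(target_bits, mad, x1, x2):
    """Solve target_bits = x1*mad/q + x2*mad/q**2 for the positive q."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    if x2 == 0:
        return x1 * mad / target_bits  # model degenerates to first order
    # With u = 1/q the model becomes a*u**2 + b*u + c = 0.
    a, b, c = x2 * mad, x1 * mad, -target_bits
    disc = b * b - 4.0 * a * c
    if disc < 0:
        raise ValueError("no real solution for these parameters")
    u = (-b + math.sqrt(disc)) / (2.0 * a)
    if u <= 0:
        raise ValueError("no positive quantization step for these parameters")
    return 1.0 / u
```

A rate controller would typically allocate `target_bits` per view, frame or basic unit first and then call something like `quantization_step` to pick the quantizer, clipping the result to the range the codec allows.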
Ching-Fang Hsu, Hao-Chen Kao, Yun-Chung Ho, Eunice Soh and Ya-Hui Jhang, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
Distance-adaptive Elastic Optical Networks (EONs) are an efficient network environment in which the number of frequency slots (FSs) assigned to a traffic demand can be chosen flexibly according to its bandwidth. In distance-adaptive EONs, routing, modulation format, and spectrum assignment (RMSA) is the main problem to be solved. By choosing a routing path appropriately and selecting a better modulation format, spectral resources can be allocated efficiently. However, owing to maximum transmission distance (MTD) constraints, if the transmission distance exceeds the MTD, the optical path may fail to be set up unless virtualized elastic regenerators (VERs) are exploited. VERs can regenerate the signal and reset the accumulated length of the path traversed. To reduce deployment cost, placing VERs strategically is an important issue. A network with a limited number of VERs is called a translucent EON. In previous studies, neither MTD nor the criticality of a node across all shortest paths was taken into account, and hence the gain in spectrum efficiency was not significant. In this work, we propose two VER placement strategies and compare them with those in the related literature. Our strategies tend to place VERs on nodes in longer transmission segments and result in better FS consumption. Simulation results demonstrate that the proposed schemes attain notable improvement in blocking performance.
Elastic Optical Networks (EONs); Distance Adaptive; Translucent Optical Networks; Virtualized Elastic Regenerator (VER); Routing, Modulation Format and Spectrum Assignment (RMSA); VER Placement
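The interaction between transmission distance, modulation format and frequency-slot demand described in the abstract above can be sketched as follows. The modulation table, the 12.5 Gb/s per-slot capacity and the function names are illustrative assumptions, not values from the paper.

```python
import math

# Illustrative values only (assumed, not taken from the abstract):
# (name, bits per symbol, maximum transmission distance in km),
# ordered from most to least spectrally efficient.
MODULATION_FORMATS = [
    ("16QAM", 4, 1200),
    ("8QAM", 3, 2400),
    ("QPSK", 2, 4800),
    ("BPSK", 1, 9600),
]
SLOT_CAPACITY_GBPS = 12.5  # assumed capacity of one FS at 1 bit per symbol


def select_modulation(distance_km):
    """Most efficient format whose MTD still covers the segment length."""
    for name, bits, mtd in MODULATION_FORMATS:
        if distance_km <= mtd:
            return name, bits
    return None  # no format reaches this far without regeneration


def required_slots(demand_gbps, distance_km):
    """Number of frequency slots needed for a demand over one segment."""
    choice = select_modulation(distance_km)
    if choice is None:
        return None
    _, bits = choice
    return math.ceil(demand_gbps / (bits * SLOT_CAPACITY_GBPS))
```

Under these assumed values, a 100 Gb/s demand needs 2 slots over 1000 km but 4 slots over 3000 km, and a 10000 km path cannot be set up at all. Splitting that path at a regenerator into two 5000 km segments makes it feasible again, which is why regenerator placement affects both blocking and slot consumption.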
Abdullah Sohrab Khan1 and Manzoor Ahmed Khan2, 1Fakultät Elektrotechnik und Informatik, Technische Universität Berlin, Berlin, Germany and 2College of IT, UAE University, UAE
Autonomous vehicles are likely to arrive sooner than expected. Autonomous vehicles with higher levels of automation rely on sensory data from both onboard and roadside sensors. Advanced approaches to improving the situational awareness of autonomous vehicles suggest implementing federated learning, where the raw data (which is huge) needs to be transferred from vehicles to edges and vice versa. This in turn generates dynamically varying communication link demands. The envisioned new era of autonomous driving demands strong interplay between key stakeholders such as city authorities and communication network providers. In this paper, we study this relationship, in which traffic efficiency on different road segments may be achieved by incentivizing autonomous vehicles with better communication resources on alternate routes. We model the profit functions of the stakeholders involved and carry out experiments. We use real traffic data collected over around 6 months through object detection and object tracking sensors deployed at Ernst-Reuter-Platz, Berlin, Germany. We design and develop an extensive framework to validate the results of the approach, comprising SUMO, a network simulator, and contributed modules. Results show that the proposed approach achieves traffic efficiency and helps network operators use underutilized network resources on the alternate paths.
Autonomous vehicle, Future Mobile Network, Resource Allocation.
Ruyun Li1, Peng Ouyang2, Dandan Song2, and Shaojun Wei3, 1Department of Microelectronics and Nanoelectronics, Tsinghua University, Beijing, China, 2TsingMicro Co. Ltd., Beijing, China and 3Department of Microelectronics and Nanoelectronics, Tsinghua University, Beijing, China
Recently, speaker embeddings extracted by deep neural networks (DNNs) have performed well in speaker verification (SV). However, they are sensitive to different scenarios and too computationally intensive to be deployed on portable devices. In this paper, we first combine rhythm features with MFCC features to improve the robustness of speaker verification. Rhythm features reflect the distribution of phonemes and help reduce the equal error rate (EER) in speaker verification, especially in intra-speaker verification. In addition, we propose a multi-task knowledge distillation architecture that transfers the embedding-level and label-level knowledge of a well-trained large teacher to a highly compact student network. Results show that rhythm features and multi-task knowledge distillation significantly improve the performance of the student network. In the ultra-short duration scenario, the student network can even achieve a 32% relative EER reduction while using only 14.9% of the parameters of the teacher network.
Multi-task learning, Knowledge distillation, Rhythm variation, Angular softmax, Speaker verification.
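A combined embedding-level and label-level distillation objective, of the kind the abstract above describes, might be written as in the sketch below. The abstract does not specify the loss terms, so the choices here (cosine distance between embeddings, KL divergence between temperature-softened posteriors, and a weight `alpha`) are assumptions for illustration, as are the function names.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def label_level_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened posteriors."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def embedding_level_loss(student_emb, teacher_emb):
    """Cosine distance (1 - cosine similarity) between two embeddings."""
    dot = sum(a * b for a, b in zip(student_emb, teacher_emb))
    norm_s = math.sqrt(sum(a * a for a in student_emb))
    norm_t = math.sqrt(sum(b * b for b in teacher_emb))
    return 1.0 - dot / (norm_s * norm_t)


def distillation_loss(student_logits, teacher_logits,
                      student_emb, teacher_emb,
                      alpha=0.5, temperature=2.0):
    """Weighted sum of the embedding-level and label-level terms."""
    emb = embedding_level_loss(student_emb, teacher_emb)
    lab = label_level_loss(student_logits, teacher_logits, temperature)
    return alpha * emb + (1.0 - alpha) * lab
```

The loss is zero when the student reproduces the teacher exactly and grows as either the posteriors or the embeddings diverge. In training it would normally be added to the student's own classification loss.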
Aurora Y. Mu, Department of Mathematics, Western Connecticut State University, Danbury, Connecticut, United States of America
This paper establishes a methodology for building hybrid machine learning models, aiming to combine the power of different machine learning algorithms on different types of features and hypotheses. A generic cost-based outlier removal algorithm is introduced as a preprocessing step for the training data. We implement a hybrid machine learning model for a credit scoring problem and experiment with combinations of three types of machine learning algorithms: SVM, DT and LR. The new hybrid models show improved performance compared with the traditional single SVM, DT, and LR models. This methodology can be further explored with other algorithms and applications.
Machine Learning, Outlier Removal, Credit Score Modelling, Hybrid Learning Model
Sajad Fathi Hafshejaniy1, Saeed Vahidian3, Zahra Moaberfard2 and Bill Lin3, 1Department of Computer Science, McGill University, Montreal, Canada, 2Department of Computer Science, Apadana University, Shiraz, Iran and 3Department of Electrical and Computer Engineering, University of California San Diego, CA, USA
Low-rank matrix factorization problems such as nonnegative matrix factorization (NMF) can be categorized as clustering or dimension reduction techniques. The latter denotes techniques designed to find representations of a high-dimensional dataset in a lower-dimensional manifold without a significant loss of information. If such a representation exists, its features should capture the most relevant information in the dataset. Many linear dimensionality reduction techniques can be formulated as a matrix factorization. In this paper, we combine the conjugate gradient (CG) method with the Barzilai and Borwein (BB) gradient method and propose a BB scaling CG method for NMF problems. The new method does not require computing or storing matrices associated with the Hessian of the objective function. Moreover, adopting a suitable BB step size along with a proper nonmonotone strategy, controlled by the convex parameter ηk, results in a new algorithm that significantly improves the CPU time, the efficiency and the number of function evaluations. A convergence result is established, and numerical comparisons on both synthetic and real-world datasets show that the proposed method is efficient and superior to existing methods.
Barzilai and Borwein Method, Matrix factorization, Non-Convex, Nonmonotone method
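As a reminder of the ingredient this abstract builds on, the classical ("long") Barzilai-Borwein step size is the ratio s·s / s·y, where s is the difference between successive iterates and y the difference between successive gradients. The sketch below applies it to plain gradient descent on a small quadratic; it is not the BB scaling CG method for NMF proposed in the paper, and the test function, fallback step and tolerance are assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def bb_step_size(x_prev, x_curr, g_prev, g_curr, fallback=0.1):
    """Classical BB step: (s.s) / (s.y), with a fallback if s.y is not positive."""
    s = [c - p for c, p in zip(x_curr, x_prev)]
    y = [c - p for c, p in zip(g_curr, g_prev)]
    sy = dot(s, y)
    if sy <= 1e-16:
        return fallback
    return dot(s, s) / sy

def bb_gradient_descent(grad, x0, iterations=100, tol=1e-10, first_step=0.1):
    """Gradient descent whose step length is recomputed by the BB rule."""
    x_prev, g_prev = list(x0), grad(x0)
    # The first step has no history, so a fixed step length is used.
    x = [xi - first_step * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iterations):
        g = grad(x)
        if dot(g, g) ** 0.5 < tol:
            break
        step = bb_step_size(x_prev, x, g_prev, g)
        x_prev, g_prev = x, g
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def quadratic_grad(x):
    """Gradient of the hypothetical test function 0.5 * (x0^2 + 10 * x1^2)."""
    return [1.0 * x[0], 10.0 * x[1]]

minimizer = bb_gradient_descent(quadratic_grad, [1.0, 1.0])
```

The step needs only two vectors of history, which is why BB-type methods avoid forming or storing Hessian-related matrices.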
Amir Farzad and T. Aaron Gulliver, Department of Electrical and Computer Engineering, University of Victoria, PO Box 1700, STN CSC, Victoria, BC Canada
Dealing with imbalanced data is one of the main challenges in machine/deep learning algorithms for classification. This issue is more important with log message data, as it is typically imbalanced and negative logs are rare. In this paper, a model is proposed to generate text log messages using a SeqGAN network. Features are then extracted using an Autoencoder, and anomaly detection and classification are performed using a GRU network. The proposed model is evaluated with two imbalanced log data sets, namely BGL and OpenStack. Results are presented which show that oversampling and balancing the data increase the accuracy of anomaly detection and classification.
Deep Learning, Oversampling, Log messages, Anomaly detection, Classification
Shaikh Farhad Hossain, Kazuhisa Hirose, Shigehiko Kanaya and Md. Altaf-Ul-Amin, Computational Systems Biology Lab, Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
Music is an entertaining part of our lives, and musical instruments are its important supporting elements. The acoustic drum plays a vital role when a song is sung. Over time, the style of musical instruments has changed while keeping the same tune, as with the electronic drum. In this work, we have developed "Virtual Musical Drums" by combining MEMS 3D accelerometer sensor data and machine learning. Machine learning is spreading to every area of problem-solving, and MEMS sensors are turning large physical systems into small ones. We have designed eight virtual drums for two sensors. We obtained 91.42% detection accuracy in a simulation environment and 88.20% detection accuracy in a real-time environment with 20% window overlap. Although the detection accuracy was satisfactory, the virtual drum sound was not realistic. We therefore implemented 'multiple hit detection within a fixed interval, sound intensity calibration and sound tune parallel processing' and selected 'virtual musical drums sound files' based on acoustic drum sound pattern and duration. Finally, we completed our "Playing Virtual Musical Drums" and successfully played the virtual drum like an acoustic drum. This work demonstrates a different application of MEMS sensors and machine learning. It also shows that data can serve not only information but also musical entertainment.
Virtual musical drum, MEMS, SHIMMER, support vector machines (SVM) and k-Nearest Neighbors (kNN)
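The 20% window overlap mentioned in this abstract can be illustrated with a small segmentation helper that cuts a stream of accelerometer samples into fixed-size windows. The function name, the window size and the choice to drop a trailing partial window are assumptions; the abstract does not describe the authors' segmentation code.

```python
def overlapping_windows(samples, window_size, overlap=0.2):
    """Split samples into fixed-size windows sharing a fraction of their length."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Consecutive windows start this many samples apart.
    step = max(1, round(window_size * (1 - overlap)))
    last_start = len(samples) - window_size
    # A trailing partial window is dropped in this sketch.
    return [samples[start:start + window_size]
            for start in range(0, last_start + 1, step)]

windows = overlapping_windows(list(range(26)), window_size=10, overlap=0.2)
```

With a window of 10 samples and 20% overlap, each window shares its last two samples with the start of the next one.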
Mashael Maashi, Nujood Alwhibi, Fatima Alamr, Rehab Alzahrani, Alanoud Alhamid and Nourah Altawallah, Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia
When manufacturers' equipment encounters an unexpected failure or undergoes unnecessary pre-scheduled maintenance, which together account for millions of hours worldwide annually, the result is time-consuming and costly. Predictive maintenance can help by using modern sensing technology and sophisticated data analytics to predict the maintenance required for machinery and devices. The demands on modern maintenance solutions have never been greater. The constant pressure to demonstrate cost-effectiveness and return on investment and to improve the competitiveness of the organization is always combined with the pressure to improve equipment productivity and keep machines running at maximum output. In this paper, we propose a maintenance prediction approach based on a machine learning technique, namely the random forest algorithm. The main focus is on industrial duct fans, as they are among the most common equipment in most manufacturing industries. The experimental results show the accuracy and reliability of the proposed predictive maintenance approach.
Predictive Maintenance, Maintenance, Random Forest, Machine Learning & Artificial Intelligence
Che-Yuan Yang and Jian-Hung Chen, Department of Computer Science, Chung Hua University, Hsinchu, Taiwan
We solve the university timetable problem with an evolution strategy and build a real system that any university department can use easily.
University timetable problem, Evolution algorithm, evolution strategy.
Courtney Foots1, Palash Pal2, Rituparna Datta1, and Aviv Segev1, 1Department of Computer Science, University of South Alabama, Mobile, United States and 2University Institute of Technology, Burdwan University, India
Computer performance is affected by hardware. Thus, vendors aim to build computers with the most effective hardware. In the present work, we propose a set of methods to classify vendors based on estimated computer performance and to predict computer performance based on hardware components. For vendor classification, we use the highest and lowest estimated performance and the frequency of occurrences of each vendor in the dataset to create performance zones. These zones can be used to list vendors whose computers satisfy a given performance requirement. For performance prediction, we use multilayered neural networks, which account for nonlinearity in our data. Several neural network architectures are analyzed in comparison to linear, quadratic, and cubic regression. We report the Mean Squared Error (MSE) and the correlation between published and predicted performance. Experiments show that neural networks have the lowest MSE, whereas cubic regression has higher correlation than neural networks and the other regression techniques.
Computer Hardware, Performance Prediction and Classification, Neural Networks, Statistical Learning, Regression.
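The two evaluation measures named in this abstract, mean squared error and the correlation between published and predicted performance, can be computed as in the sketch below. Pearson correlation is assumed here, since the abstract does not say which correlation coefficient was used.

```python
import math

def mean_squared_error(published, predicted):
    """Average squared difference between published and predicted values."""
    return sum((p - q) ** 2 for p, q in zip(published, predicted)) / len(published)

def pearson_correlation(published, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(published)
    mean_x = sum(published) / n
    mean_y = sum(predicted) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(published, predicted))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in published))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in predicted))
    return covariance / (spread_x * spread_y)
```

The two measures can rank models differently, which is consistent with the abstract's finding: a prediction that is perfectly correlated with the published values can still have a large MSE if it is scaled or shifted.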
Lamia Al-Horaibi and Daniyal Alghazzawia, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Twitter is an open communication network where users share their views and ideas. Because of its open access policy, Twitter has attracted spammers, who see it as a supporting tool for spreading spam messages. Lately, a considerable number of survey papers on Twitter spam detection have become available. In this paper, we conduct a systematic literature review of Twitter spam detection techniques that use different Machine Learning approaches. For this purpose, we consider research works published from 2014 to 2019. We choose 17 studies and review their methods, algorithms, evaluation measures and datasets, and finally compare their results. There is a noticeable trend of future research in this area, and this survey can act as a reference point for the future direction of research.
Twitter, Spam Detection, Machine Learning Algorithm.
MAHAMAT Moussa Dogoumi and Adam Skorek, Department of Electrical Engineering, University of Quebec in Trois-Rivieres, Quebec, Canada
FANET networks have seen some progress recently. Numerous solutions exist for positioning network nodes, and one in particular, VBCA, offers a comprehensive and interesting approach. This method has the particularity of positioning the nodes in 3-D. In general, 3-D positioning is an NP-hard problem, but with VBCA positioning is relatively simple. In this paper we present an optimization of communication in a topology based on VBCA (Virtual forces Based Clustering Algorithm). The topology is optimized by the VBCA algorithm for a better coverage area. This method is 40% more efficient than existing approaches in terms of area coverage. In the first versions of VBCA, performance in terms of communication between nodes was not tested. The resulting network performance is very encouraging in terms of throughput, delay and packet loss. This work therefore provides a first assessment of VBCA's network performance.
VBCA, FANET, Communication, Topology, Positioning, Clustering
Ramesh Kumar Kait and Renu Jangra, Kurukshetra University, Kurukshetra, India
Routing protocols in wireless sensor networks play a major role in network performance, including energy management and network lifetime. Routing protocols are developed on the basis of different schemes such as clustering, chaining and cost-based routing. A wireless sensor network consists of a large number of nodes, which are sometimes difficult to manage, so a good approach is to group nodes into clusters. This technique, called clustering, limits the energy used by the sensor nodes. Communication and management of the nodes in a cluster are handled by the cluster head. The existing DEEC protocol works efficiently during communication and survives for a long time in the network. In this paper, however, the cluster head is selected among the cluster nodes by a probability rule based on ACO. The cluster nodes send data to the cluster head, which in turn sends the related information to the base station. ACO-DEEC (Ant Colony Optimization based Distributed Energy Efficient Clustering protocol) computes the probability rule for selecting the cluster head from two parameters: the distance and the power of the nodes. As a result, this algorithm improves on the existing DEEC protocol in energy usage, number of packets received at the base station, and number of dead nodes.
Ant Colony optimization, Wireless sensor Network, DEEC Protocol, ACO-DEEC, Cluster Head
Jacob Danovitch, Institute for Data Science, Carleton University, Ottawa, Canada
Many computational social science projects examine online discourse surrounding a specific trending topic. These works often involve the acquisition of large-scale corpora relevant to the event in question to analyze aspects of the response to the event. Keyword searches present a precision-recall trade-off and crowd-sourced annotations, while effective, are costly. This work aims to enable automatic and accurate ad-hoc retrieval of comments discussing a trending topic from a large corpus, using only a handful of seed news articles.
political events, Specific-Correspondence LDA, sporting matches
Haider Khalid1 and Vincent Wade2, 1School of Computer Science and Statistics, Trinity College Dublin, University of Dublin, Dublin, Ireland and 2ADAPT Centre, Trinity College Dublin, University of Dublin, Dublin, Ireland
A conversational system needs to know how to switch between topics in order to sustain a conversation for an extended period. Detecting topics in a dialogue corpus has become an important task, and accurate prediction of conversation topics is important for creating coherent and engaging dialogue systems. This paper addresses topic detection with the Parallel Latent Dirichlet Allocation (PLDA) model by clustering a vocabulary of known similar words based on TF-IDF scores and a Bag of Words (BOW) approach. In the experiment, we use K-means clustering with the Elbow method, which interprets and validates within-cluster consistency, to select the optimal number of clusters. We evaluate our approach by comparing it with traditional LDA and a clustering technique. The experimental results show that combining PLDA with the Elbow method selects the optimal number of clusters and refines the topics for the conversation.
Conversational dialogue, latent Dirichlet allocation, topic detection, topic modelling, text-classification
Suman Dowlagar and Radhika Mamidi, Language Technologies and Research Center, KCIS, International Institute of Information Technology, Gachibowli, Hyderabad, India
The presence of irrelevant features in the data leads to high dimensionality and complexity in machine learning models. Feature selection addresses high dimensionality by discarding irrelevant features from the feature space, thus reducing model complexity and enhancing accuracy. In this paper, we define the most discriminative and highly informative (MDHI) feature selection method. Discriminative information between the features is computed using clustering and similarity metrics. The information gain between the class label and a feature helps to select highly informative features. The MDHI method is evaluated on different datasets and compared with various state-of-the-art feature selection methods. The results show that this method performs better for classification tasks.
feature selection, high dimensionality, dimensionality reduction, clustering, similarity, information gain
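The information gain between a class label and a feature, which this abstract uses to pick informative features, is the entropy of the labels minus the expected entropy of the labels after splitting on the feature. The sketch below covers discrete features only and is a textbook formulation, not the MDHI method itself; the function names are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Expected entropy of the labels within each feature-value group.
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional
```

A feature that separates the classes perfectly recovers the full label entropy, while a feature that is independent of the labels has a gain of zero.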
Abdullah Aref1, Rana Husni Al Mahmoud2, Khaled Taha3, and Mahmoud Al-Sharif4, 1Computer Science Department, Princess Sumaya University for Technology, Amman, Jordan, 2Computer Science Department, University of Jordan, Amman, Jordan, 3Social Media Lab, Trafalgar AI, Amman, Jordan, 4Social Media Lab, Trafalgar AI, Amman, Jordan
The aim of sentiment analysis is to automatically extract the opinions from a given text and decide its sentiment. In this paper, we introduce the first publicly available Twitter dataset on Sunnah and Shia (SSTD), as part of religious hate speech, which is a sub-problem of general hate speech. We further provide a detailed review of the data collection process and our annotation guidelines, so that reliable dataset annotation is ensured. We applied several stand-alone classification algorithms to the Twitter hate speech dataset, including Random Forest, Complement NB, Decision Tree, and SVM, as well as two deep learning methods, CNN and RNN. We further study the influence of word embedding dimensions for FastText and word2vec. In all our experiments, the classification algorithms are trained using a random split of the data (66% for training and 34% for testing). The two subsets were obtained by stratified sampling of the original dataset. CNN-FastText achieves the highest F-measure (52.0%), followed by CNN-Word2vec (49.0%), showing that neural models with FastText word embeddings outperform classical feature-based models.
Hate Speech, Dataset, Text classification, Sentiment analysis
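The 66%/34% stratified split described in this abstract keeps the class proportions of the original dataset in both subsets. A minimal sketch of such a split, working on label indices, is given below; the function name, the fixed seed and the rounding rule are assumptions rather than the authors' procedure.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.66, seed=0):
    """Return (train_indices, test_indices) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut at the requested fraction.
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)
```

Splitting per class rather than over the whole dataset matters most when one class is rare, since a purely random split can leave the test set with too few examples of it.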
Ashwini Badgujar, Sheng Cheng, Andrew Wang, Kai Yu, Paul Intrevado and David Guy Brizan, University of San Francisco, San Francisco, CA, USA
In this project, we continuously collect data from the RSS feeds of traditional news sources. We apply several pre-trained implementations of named entity recognition (NER) tools, quantifying the success of each implementation. We also perform sentiment analysis of each news article at the document, paragraph and sentence level, with the goal of creating a corpus of tagged news articles that is made available to the public through a web interface. Finally, we show how the data in this corpus could be used to identify bias in news reporting.
Content Analysis, Named Entity Recognition, Sentiment Analysis
Alaidine Ben Ayed, Ismaïl Biskri and Jean-Guy Meunier, Université du Québec à Montréal (UQAM), Canada
In this paper, we present VSMbM, a new metric for evaluating automatically generated text summaries. VSMbM is based on vector space modelling. It gives insight into the extent to which retention and fidelity are met in the generated summaries. Two variants of the proposed metric, namely PCA-VSMbM and ISOMAP-VSMbM, are tested and compared to Recall-Oriented Understudy for Gisting Evaluation (ROUGE), a standard metric used to evaluate automatically generated summaries. Experiments conducted on the Timeline17 dataset show that VSMbM scores are highly correlated with the state-of-the-art ROUGE scores.
Automatic Text Summarization, Automatic summary evaluation, Vector space modelling.
Omar Mossad, Amgad Ahmed, Anandharaju Raju, Hari Karthikeyan, and Zayed Ahmed, Simon Fraser University, Burnaby, Canada
Machine-based text comprehension has always been a significant research field in natural language processing. Once a full understanding of the text context and semantics is achieved, a deep learning model can be trained to solve a large subset of tasks, e.g. text summarization, classification and question answering. In this paper we focus on the question answering problem, specifically multiple-choice questions. We develop a model based on BERT, a state-of-the-art transformer network. Moreover, we alleviate BERT's limited ability to handle large text corpora by extracting the highest-influence sentences through a semantic similarity model. Evaluations of our proposed model demonstrate that it outperforms the leading models in the MovieQA challenge, and we are currently ranked first on the leaderboard with a test accuracy of 87.79%. Finally, we discuss the model's shortcomings and suggest possible improvements to overcome these limitations.
Natural Language Processing, Question Answering, Semantic Similarity & BERT
Sourav Sen, Department of Physics, Duke University, Durham, NC, USA
Informal transliteration from other languages to English is prevalent in social media threads, instant messaging, and discussion forums. Without identifying the language of such transliterated text, users who do not speak that language cannot understand its content using translation tools. We propose a Language Identification (LID) system, with an approach for feature extraction, which can detect the language of transliterated texts reasonably well even with limited training data and computational resources. We tokenize the words into phonetic syllables and use a simple Long Short-term Memory (LSTM) network architecture to detect the language of transliterated texts. With intensive experiments, we show that the tokenization of transliterated words as phonetic syllables effectively represents their causal sound patterns. Phonetic syllable tokenization, therefore, makes it easier for even simpler model architectures to learn the characteristic patterns to identify any language.
Transliteration, Language Identification System (LID), Phonetic Syllables, Long Short-term Memory (LSTM) recurrent neural networks
Emilia Apostolova PhD1, Joe Morales2, Ioannis Koutroulis, MD PhD3 and Tom Velez, PhD, JD2, 1Language.ai, Chicago, IL, USA, 2Computer Technology Associates, Ridgecrest, CA, USA and 3Children's National Health System, Washington, DC, USA
While there has been considerable progress in building deep learning models based on clinical time series data, overall machine learning (ML) performance remains modest. Typical ML applications struggle to combine various heterogeneous sources of Electronic Medical Record (EMR) data, often recorded as a combination of free-text clinical notes and structured EMR data. The goal of this work is to develop an approach for combining such heterogeneous EMR sources for time-series based patient outcome predictions. We developed a deep learning framework capable of representing free-text clinical notes in a low-dimensional vector space, semantically representing the overall patient medical condition. The free-text based time-series vectors were combined with time series of vital signs and lab results and used to predict patients at risk of developing a complex and deadly condition: acute respiratory distress syndrome. Results utilizing early data show significant performance improvement and validate the utility of the approach.
Natural Language Processing, Clinical NLP, Time-series data, Machine Learning, Deep Learning, Free-text and structured data, Clinical Decision Support, ARDS, COVID-19
Rajeev Kanth1, Tuomas Korpi1 and Jukka Heikkonen2, 1School of Engineering and Technology, Savonia University of Applied Sciences, Opistotie 2, 70101 Kuopio Finland and 2Department of Future Technologies, University of Turku, Vesilinnantie 5, 20500 Turku Finland
In this article, a Smart Ping Pong Paddle is introduced as an example of the use of sensor technology in sports. We use an accelerometer and a gyroscope sensor for the analysis and gather motion data of the paddle to build a real-time 3D replica that shows its actual orientation. We examine the technical details and principles of getting the digital motion processing data from the sensor to a microcontroller and then transferring it wirelessly to 3D modeling software. These technical details are applied in practice, and a working demo of the Smart Ping Pong Paddle is built. A couple of other similar applications in the realm of object orientation sensing are also reviewed.
Accelerometer Sensor, Gyroscope Sensor, Ping Pong Paddle, and Motion Analysis
Faïza Tabbana, Assistant Professor, Department of Telecommunication Engineering, Military Academy, Fondek Jedid Tunisia
A wireless sensor network (WSN) consists of a large number of sensors which are spatially distributed and capable of computing, communicating and sensing. A WSN is similar to a MANET (Mobile Ad-hoc Network), whose nodes are mobile and communicate directly without a base station. WSNs are characterized by multi-hop wireless connectivity and frequently changing network topology, and they need efficient routing protocols. The main purpose of this paper is to compare the performance of three different protocols, AODV (Ad-hoc On-demand Distance Vector routing), DSDV (Destination-Sequenced Distance-Vector routing) and ZRP (Zone Routing Protocol), which constitute a good combination of on-demand (reactive), table-driven (proactive) and hybrid protocols, respectively. The performance of these protocols varies depending on the simulation environment. It is analyzed in two ways: first, by varying the number of nodes from 10 to 100; second, by keeping the number of nodes constant and varying the node speed from 10 m/s to 90 m/s. A comparison between the DSDV, AODV and ZRP protocols is discussed in detail throughout this paper.
Wireless Sensor Network, AODV, DSDV, ZRP, Performance Metrics
Matyokubov U.K., Davronbekov D.A., Department of Mobile Communication, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Tashkent, Uzbekistan
Reducing energy consumption in the power systems of mobile communication networks can save significant economic resources. The issue is especially important in developing countries or regions where disruptions in the electricity supply are more frequent. No technical equipment can operate at high efficiency without a continuous supply of electricity. This issue can be addressed in several ways, such as saving electricity in mobile communication systems and using long-life batteries, renewable energy sources, or diesel-electric generators. To study the power consumption of base stations in mobile communication systems in terms of electrical and logistics parameters, we observed the processes that affect communication quality through the power consumption and power outages of nearly 250 base stations for almost a year. We thus studied the role of cellular communication systems in overall energy consumption and the energy consumption of the key elements (air conditioning and transmission devices) in ensuring communication. We recommend that elements of mobile communication systems use techniques based on new power-supply algorithms, that is, renewable energy sources.
Mobile communication systems, base station, electrical energy, survivability, reliability.
Siddharth Sekar, Nirmit Agarwal and Vedant Bapodra, Department of Computer Engineering, Mukesh Patel School of Technology Management and Engineering, Mumbai, India
This paper proposes a solution to improve the lives of patients who are paralyzed and/or suffering from different Motor Neuron Diseases (MND) like Amyotrophic Lateral Sclerosis (ALS), Primary Lateral Sclerosis etc. by making them more independent. Patients suffering from these diseases will not be able to move their arms and legs. They also lose their body balance and the ability to speak. Here we propose an IoT based communication controller using the concept of Morse Code Technology. The paper proposes integrating an IoT device with a Smartphone which can be controlled using the IoT device through the concept of Morse Code.
Internet of Things (IoT), Motor Neuron Disease (MND), Amyotrophic Lateral Sclerosis (ALS), Arduino
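The Morse-code communication idea in this abstract can be illustrated by the decoding step that turns a sequence of dots and dashes into text. The sketch assumes letters are separated by spaces and words by " / ", a common text convention; how the authors' IoT device encodes gaps between symbols is not stated in the abstract.

```python
# International Morse code for the 26 Latin letters.
MORSE_TO_LETTER = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

def decode_morse(signal):
    """Decode Morse text: letters separated by spaces, words by ' / '."""
    words = []
    for word in signal.strip().split(" / "):
        # Unknown symbols are shown as '?' rather than raising an error.
        letters = [MORSE_TO_LETTER.get(symbol, "?") for symbol in word.split()]
        words.append("".join(letters))
    return " ".join(words)

message = decode_morse(".... . .-.. .--. / -- .")
```

Because each character needs only two distinguishable inputs, a short and a long press, a single switch or sensor is enough for a patient with very limited movement to compose text.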
Prabjot Kaur, Mumbai, Maharashtra, India
Security is a vital element in enabling the widespread adoption of Internet of Things technologies and applications. Without guarantees of system-level confidentiality, credibility and privacy, the relevant stakeholders are unlikely to adopt Internet of Things solutions on a large scale. In early-stage Internet of Things deployments (e.g., those based on RFIDs only), security solutions were mostly devised in an ad hoc manner. This stems from the fact that such deployments were usually vertically integrated, with all components under the control of a single administrative entity. In this work we propose a new dynamic cipher to access more than one device simultaneously in a network using a single controller, by making use of the Dynamic variable cipher security certificate protocol. This protocol uses the key-matrix concept: we make use of key matrices and store the same key matrix at all the communicating nodes. When plain text is encrypted to cipher text at the sending side, the sender transmits the cipher text without the key that is to be used to decode the message.
Internet of things, Security, Encryption, Dynamic Cipher
Mustapha Hedabou, School of Computer and Communication Sciences, University Mohammed VI Polytechnic, Benguerir, Morocco and Dept. Computer Science, ENSA de Sa, University Cadi Ayyad, Marrakech, Morocco
Over the past decade, Trusted Platform Modules (TPMs) have become a valuable tool for providing a high level of trust in locally executing software. Indeed, in addition to being available on most commodity computers, TPMs are free of cost, unlike other available hardware-based devices, while offering the same level of security. Enhancing trust in SaaS services regarding the security and privacy of the hosted SaaS application services can turn out to be a pertinent application scope for TPMs. In this paper we present a design for a trusted SaaS model that gives cloud users more confidence in SaaS services by leveraging TPM functionalities combined with a trusted source code certifying authority. In our design, the cloud computing provider hosting the SaaS services acts as a root of trust by giving end cloud users assurance of the integrity of the SaaS application service running on its platform. A new multisignature mechanism is developed for computing a joint signature of the SaaS service software by the trusted authority and the TPM. A prototype implementation of the proposed design shows that the integrity of the SaaS application service, before and after it is launched on a cloud provider platform, is guaranteed at low cost.
Cloud computing, SaaS services, TPM, trust, Source code certification, Multisignature schemes
Salah Harb, M. Omair Ahmad and M.N.S Swamy, Electrical and Computer Engineering Department, Concordia University, 1440 De Maisonneuve, Montreal, Canada
In this paper, a novel and efficient hardware implementation of a steganographic cryptosystem based on public-key cryptography is proposed. Digital images are used as carriers of secret data between the sender and receiver in the communication channel. The proposed public-key cryptosystem offers a separable framework that allows secret data to be embedded or extracted and the carrier to be encrypted or decrypted independently, using the public-private key pair. The Paillier cryptographic system is adopted to encrypt and decrypt the pixels of the digital image. To achieve efficiency, a parallel Montgomery exponentiation core is designed and implemented to perform the underlying field operations in the Paillier cryptosystem. The hardware implementation results show that the proposed steganographic cryptosystem is efficient in terms of area (resources), performance (speed) and power consumption. Our steganographic cryptosystem has a small footprint, making it well suited to embedded systems and real-time processing engines in applications such as medical scanning devices, self-driving cars and drones.
Image Steganography, Public-Key Cryptography, Homomorphic Cryptosystem, Montgomery Exponentiation & Field Programmable Gate Array
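For readers unfamiliar with the Paillier cryptosystem named in this abstract, the sketch below shows textbook Paillier encryption and decryption of a single pixel value, together with the additive homomorphic property that makes it attractive for data embedding. It is a software toy with deliberately tiny, insecure primes and the common simplification g = n + 1; it does not model the authors' hardware design or their Montgomery exponentiation core, and all function names are illustrative.

```python
import math
import random

def paillier_keygen(p=293, q=433):
    """Toy key generation with small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, lam, mu

def paillier_encrypt(n, message, rng):
    """Encrypt 0 <= message < n as g^m * r^n mod n^2 with g = n + 1."""
    n_squared = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared

def paillier_decrypt(n, lam, mu, ciphertext):
    """Recover the message as L(c^lam mod n^2) * mu mod n, L(u) = (u - 1) / n."""
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n

n, lam, mu = paillier_keygen()
rng = random.Random(1)  # fixed seed so the example is repeatable
pixel_cipher = paillier_encrypt(n, 200, rng)
offset_cipher = paillier_encrypt(n, 55, rng)
# Multiplying ciphertexts adds the underlying plaintexts.
combined = (pixel_cipher * offset_cipher) % (n * n)
recovered_sum = paillier_decrypt(n, lam, mu, combined)
```

Every operation above reduces to modular exponentiation with a large modulus, which is why a fast exponentiation core dominates the cost of a hardware implementation.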