There is an abnormal increase in the crime rate, and the number of criminals is also growing, which raises serious concerns about security. Crime prevention and criminal identification are the primary issues before police personnel, since protecting lives and property is the basic concern of the police; however, the police personnel available to combat crime are limited. With the advent of security technology, cameras, especially CCTV, have been installed in many public and private areas to provide surveillance. CCTV footage can be used to identify suspects at the scene. This real-time criminal identification system based on face recognition works as a fully automated facial recognition pipeline: a feature-based cascade classifier is used for face detection, and OpenCV's LBPH (Local Binary Pattern Histograms) algorithm for face recognition, so the system can detect and recognize faces automatically in real time. Accurately locating a face remains a challenging task; the Viola-Jones framework has been widely used by researchers to detect the location of faces and objects in a given image, and pre-trained face detection classifiers are shared by public communities such as OpenCV.
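As a concrete illustration of this detection-plus-recognition pipeline, the minimal sketch below pairs an OpenCV Haar cascade with an LBPH recognizer; the model file name and the distance threshold of 80 are illustrative assumptions, not values from the paper (`cv2.face` requires the opencv-contrib-python package).

```python
# Minimal sketch: Haar cascade detection + LBPH recognition with OpenCV.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read("criminal_faces.yml")  # assumed pre-trained model file

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
        label, distance = recognizer.predict(gray[y:y + h, x:x + w])
        # Lower distance means a closer match; 80 is an assumed threshold.
        text = f"id {label}" if distance < 80 else "unknown"
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, text, (x, y - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```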
An intelligent licence plate detection method can make travel more convenient and efficient. However, traditional methods are reasonably effective only under specific circumstances or strong assumptions. Therefore, a novel real-time car plate detection method based on an improved YOLOv3 has been proposed. A deep learning object detection algorithm is utilized to select a more precise number of candidate anchor boxes and their aspect-ratio dimensions. The experimental results show that the proposed method outperforms the original YOLOv3. Models that perform well on more general object detection and recognition tasks, such as YOLOv3, can be effectively transferred to license plate detection with a small amount of model tuning. The paper focuses on the design of experiments (DOE) over training parameters when transferring the YOLOv3 model and optimizing training specifically for license plate detection. The parameters are categorized to reduce the number of DOE runs while gaining insight into YOLOv3 parameter interactions, rather than merely seeking optimized training settings. The results show that the DOE effectively improves the fit of the YOLOv3 model to the vehicle license plate detection task.
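Since the paper's anchor-selection step is not spelled out here, the sketch below shows the common way candidate anchors are derived for YOLOv3-style detectors: k-means clustering over annotated box widths and heights with an IoU-based distance. The synthetic box sizes are purely illustrative; a real run would load widths and heights from the plate annotations.

```python
# Sketch: k-means anchor selection over (width, height) of labelled boxes.
import numpy as np

def iou_wh(box, anchors):
    """IoU between one (w, h) box and k anchor (w, h) pairs."""
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=6, iters=100):
    anchors = boxes[np.random.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # Assign each box to the anchor with highest IoU (distance = 1 - IoU).
        assign = np.array([np.argmax(iou_wh(b, anchors)) for b in boxes])
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]

# Fake plate-like box sizes standing in for real annotations.
boxes = np.abs(np.random.randn(500, 2)) * [120, 40] + [60, 20]
print(kmeans_anchors(boxes))
```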
The vision-based live vehicle detection and distance calculation project leverages the state-of-the-art YOLOv5 algorithm to provide real-time vehicle detection and accurate distance measurement. The solution is designed to analyze both live video streams and static images, offering a versatile platform for monitoring and assessing vehicle movements. The YOLOv5 architecture, known for its high accuracy and speed, is employed to detect vehicles within the input data. The project accommodates two main modes of operation: live video input and image upload. For live video input, users receive instant, real-time feedback on vehicle detection and distance calculation.
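A hedged sketch of such a pipeline follows, loading YOLOv5 through torch.hub and estimating distance with a pinhole-camera approximation. The focal length, the assumed average car width, and the image path are illustrative; the paper's own calibration method is not specified.

```python
# Sketch: YOLOv5 detection + pinhole-camera distance estimate.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

FOCAL_PX = 700.0     # assumed focal length of the camera in pixels
CAR_WIDTH_M = 1.8    # assumed average real car width in metres

results = model("traffic.jpg")             # illustrative input path
for *xyxy, conf, cls in results.xyxy[0].tolist():
    if model.names[int(cls)] in ("car", "truck", "bus"):
        box_w = xyxy[2] - xyxy[0]          # detected box width in pixels
        dist_m = FOCAL_PX * CAR_WIDTH_M / box_w
        print(f"{model.names[int(cls)]}: ~{dist_m:.1f} m (conf {conf:.2f})")
```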
Weed identification in vegetable plantations is more challenging than identification in field crops because of the random plant spacing, and so far little work has addressed it. Traditional crop weed identification methods mainly focus on identifying the weeds directly; however, weed species vary greatly. This paper proposes a new method that works the opposite way, combining deep learning and image processing. First, a trained CenterNet model detects the vegetables and draws bounding boxes around them; the remaining green objects falling outside the bounding boxes are then considered weeds. In this way, the model focuses on identifying only the vegetables and thus avoids handling the many weed species. This strategy also greatly reduces the size of the training image dataset and the complexity of weed detection, thereby improving weed identification performance and accuracy. To extract weeds from the background, a color index-based segmentation is performed using image processing; the employed color index was determined and evaluated through genetic algorithms (GAs) according to Bayesian classification error. In the field test, the trained CenterNet model achieved a precision of 95.6%, a recall of 95.0%, and an F1 score of 0.953. The proposed index −19R + 24G − 2B ≥ 862 yields high segmentation quality at a much lower computational cost than the widely used ExG index. These experimental results demonstrate the feasibility of the proposed method for ground-based weed identification in vegetable plantations.
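The sketch below applies the paper's GA-derived colour index directly in NumPy and masks out the vegetable boxes; the image path and the bounding-box list are placeholders standing in for real CenterNet detections.

```python
# Sketch: keep pixels where -19R + 24G - 2B >= 862, remove crop boxes,
# and treat the remaining green regions as weeds.
import cv2
import numpy as np

img = cv2.imread("field.jpg")                      # illustrative path
B, G, R = [img[:, :, i].astype(np.int32) for i in range(3)]  # OpenCV is BGR
mask = (-19 * R + 24 * G - 2 * B) >= 862           # GA-derived index from the paper

vegetable_boxes = [(120, 80, 260, 220)]            # placeholder CenterNet outputs
for x1, y1, x2, y2 in vegetable_boxes:
    mask[y1:y2, x1:x2] = False                     # green pixels inside boxes are crop

weed_mask = (mask * 255).astype(np.uint8)
cv2.imwrite("weeds.png", weed_mask)
```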
Face recognition plays a vital role in a variety of applications, from biometrics, surveillance, and security to identification and authentication. In this paper we design and implement a smart security system for a restricted area where access is limited to people whose faces are available in the training database. First, faces are detected by detecting human motion; face recognition is then performed to determine whether the person is authorized to enter the sensitive area. At the same time, the coordinates of the detected motion are tracked; if face recognition fails, the estimated coordinates are passed on so that the intruder is detected automatically. Experimental results demonstrate the effectiveness of the proposed security system in restricting unauthorized access, with reliability enhanced by the use of face recognition. Although this reduces the complexity of face recognition, real-time protection of the sensitive area remains a concern. This is a somewhat hard problem and can be addressed by automatically e-mailing the administrator when a person attempts to trespass into the sensitive area.
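A minimal sketch of the motion-then-recognition flow is given below, using frame differencing to localize movement and a Haar cascade on the moving region; the difference threshold and the minimum contour area are assumed values.

```python
# Sketch: frame differencing to find motion, then face detection on it.
import cv2

face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)
_, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(prev_gray, gray)
    prev_gray = gray
    _, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) < 900:       # ignore small motion; assumed value
            continue
        x, y, w, h = cv2.boundingRect(c)   # coordinates of the detected motion
        faces = face_det.detectMultiScale(gray[y:y + h, x:x + w], 1.3, 5)
        # A recognizer would now match each face against the authorized set;
        # an unrecognized face triggers the intruder alert (e.g. admin e-mail).
    cv2.imshow("motion", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
```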
Depression is a common and serious mental health disorder that affects millions of people worldwide. Early detection and diagnosis of depression are crucial for effective treatment and management of the condition. Machine learning algorithms, such as convolutional neural networks (CNNs), have shown promising results in detecting depression from various data sources, including audio, text, and images. This paper proposes a CNN-based approach for depression detection using facial images. The proposed system uses a pre-trained CNN model, such as VGG-16, to extract features from facial images; the extracted features are then used to train a support vector machine (SVM) classifier to detect depression.
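A hedged sketch of this extract-then-classify design follows, with VGG-16 as a frozen feature extractor feeding a scikit-learn SVM; `load_dataset()` is a hypothetical helper standing in for the labelled facial-image data.

```python
# Sketch: frozen VGG-16 features + SVM classifier.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image
from sklearn.svm import SVC

extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

def vgg_features(path):
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return extractor.predict(x, verbose=0).ravel()      # 512-d descriptor

paths, labels = load_dataset()   # hypothetical helper: image paths + labels
X = np.stack([vgg_features(p) for p in paths])
clf = SVC(kernel="rbf").fit(X, labels)                  # depressed vs. not depressed
```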
COVID-19, the worst situation faced by humanity in recent times, had spread across more than 180 countries, with about 37,000,000 confirmed cases and 1,000,000 deaths worldwide as of October 2020. The absence of medical and strategic expertise is a colossal problem, and the lack of immunity increases the risk of being affected by the virus. Since no vaccine was available, social distancing and face covering were the primary precautionary methods appropriate to the situation. This study proposes automation with a deep learning framework for monitoring social distancing from surveillance video footage and for face mask detection in public and crowded places, treated as a mandatory rule set under pandemic terms, using computer vision. The proposed framework is based on an object detection model that separates the background from human beings, marking people with bounding boxes and assigned identifications. Within the same framework, a trained module (mobilenet_v2) checks for any unmasked individual, segments the human face, and labels it "Mask" or "Unmask".
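The sketch below illustrates the mask/no-mask check on detected face crops with a MobileNetV2-style classifier; the weights file name and the 0.5 decision threshold are assumptions, since the paper's trained module is not published.

```python
# Sketch: face detection followed by a MobileNetV2-based mask classifier.
import cv2
from tensorflow.keras.models import load_model
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

mask_net = load_model("mask_detector.h5")     # assumed trained module
face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("crowd.jpg")               # illustrative input
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in face_det.detectMultiScale(gray, 1.1, 5):
    face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))
    face = preprocess_input(face.astype("float32")[None, ...])
    p_mask = float(mask_net.predict(face, verbose=0)[0][0])
    label = "Mask" if p_mask > 0.5 else "Unmask"   # assumed threshold
    cv2.putText(frame, label, (x, y - 8),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cv2.imwrite("annotated.png", frame)
```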
Sign language is a crucial means of communication for the deaf and hard of hearing community. This project presents a comprehensive approach to real-time sign language detection using convolutional neural networks (CNNs) and the You Only Look Once (YOLO) object detection framework. The primary objective is to bridge the communication gap between individuals who use sign language and those who may not understand it. First, the implementation uses real-time voice recognition to detect spoken words and translate them into the corresponding alphabet-letter sign language gestures. A CNN model is trained on a custom dataset of sign language gestures associated with various spoken words; this model achieves accurate word recognition, enabling the translation of spoken language into visual sign representations. Second, the project integrates live camera input to detect sign language gestures directly from hand movements. The YOLO framework is employed to identify and localize individual signs corresponding to letters of the alphabet; by training the YOLO model on an annotated dataset of hand signs, the system can accurately recognize and display the appropriate letter for each detected gesture. The experimental results demonstrate the effectiveness of the proposed approach: the CNN model achieves high accuracy in recognizing spoken words, facilitating accurate translation into sign language, and the YOLO-based sign language detection is robust in identifying different sign gestures from a live camera feed, with real-time performance suitable for practical applications. The system is implemented in OpenCV-Python with various supporting libraries.
The modern keyboard for personal computers developed from the similar one used in typewriters. The layout has remained the same, but the computer keyboard uses the making and breaking of an electrical contact to detect a key press. The major disadvantage of this concept is that a large amount of physical space is needed to accommodate the keyboard, making it unsuitable for applications such as mobile phones, where it would place limitations on screen size. To overcome this drawback, touchscreens were developed, integrating the input mechanism into the screen itself. However, typing on touchscreens is inconvenient for most users because of the small button size, and since touchscreen keypads are implemented in software, they also raise some security issues.
In this project, we propose using YOLOv7 to recognize sign language gestures. We use a dataset of images and videos of individuals performing various sign language gestures and train a YOLOv7 model to recognize these gestures in real time. To do this, we first preprocess the dataset by extracting the relevant frames and labeling the sign language gestures. We then use YOLOv7 to train a neural network to recognize these gestures; the model is trained with a combination of image augmentation techniques and transfer learning to improve accuracy and reduce overfitting. Once the model is trained, we use it to detect and classify sign language gestures in real-time video streams, capturing input with a webcam and applying the YOLOv7 model to recognize the gestures. The output is displayed on the screen, allowing individuals who are deaf or hard of hearing to communicate more easily.
In speech emotion recognition (SER), emotional characteristics often appear as diverse energy patterns in spectrograms, and typical attention-based neural network classifiers for SER are usually optimized at a fixed attention granularity. In this paper, we apply multiscale area attention in a deep convolutional neural network to attend to emotional characteristics at varied granularities, so the classifier can benefit from an ensemble of attentions with different scales. To deal with data sparsity, we conduct data augmentation with vocal tract length perturbation (VTLP) to improve the generalization capability of the classifier. The system can classify emotion in real time across three modalities (speech, face, and text). Experiments are carried out on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset and achieve results that, to the best of our knowledge, are state of the art on this dataset.
In this paper we describe a methodology and an algorithm to estimate the age, gender, and emotion of a human in real time by analysing face images from a webcam, and we discuss a CNN-based architecture for designing such a real-time model. Emotion, gender, and age detection from webcam facial images plays an important role in many applications, including forensics, security control, data analysis, video observation, and human-computer interaction. We present methods and techniques such as PCA, LBP, SVM, Viola-Jones, and HOG, which are used directly or indirectly to recognize human emotion, gender, and age under various conditions.
The use of doctor-computer interaction devices in the operating room (OR) requires new modalities that support medical image manipulation while allowing doctors' hands to remain sterile, supporting their focus of attention, and providing fast response times. This paper presents "Gestix," a vision-based hand gesture capture and recognition system that interprets the user's gestures in real time for navigation and manipulation of images in an electronic medical record (EMR) database. Navigation and other gestures are translated into commands based on their temporal trajectories, captured on video; Gestix accesses an image and then manipulates it according to whatever gesture command is given. A simple way to store information is to capture an image of a handwritten document and save it in image format; the method for transforming handwritten data into electronic form is optical character recognition (OCR), which involves several steps including pre-processing, segmentation, feature extraction, and post-processing. Many researchers have used OCR for character recognition. This system uses an Android phone to capture an image of the document, and the further steps are performed by OCR. The main challenge is recognizing characters across different styles of handwriting; thus, a system is designed that recognizes the handwritten data to obtain editable text. The output of this system depends on the data written by the writer. Our system offers 90% accuracy on handwritten documents and gives an easy way to edit or share the recognized data.
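For the handwritten-document half of this entry, a minimal OCR sketch is given below using OpenCV pre-processing and the Tesseract engine via pytesseract; Tesseract is a stand-in here, since the paper does not name its OCR engine.

```python
# Sketch: pre-process a document photo, then run OCR to get editable text.
import cv2
import pytesseract

img = cv2.imread("handwritten_page.jpg")          # photo from the phone camera
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 3)                    # denoise before binarization
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
text = pytesseract.image_to_string(binary)        # recognized, editable text
print(text)
```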
The goal of the paper is to improve the recognition of human hand postures in a human-computer interaction application, to reduce the computation time, and to improve user comfort with respect to the hand postures used. The authors developed an application for computer mouse control. Based on the proposed algorithm, a hand pad color, and the selected hand feature, the application shows good behavior in terms of computation time, and the proposed hand postures give the user increased comfort in using the system. The system also behaves consistently well under both very low and very high illuminance levels.
This study is an attempt to understand and address the mental health issues of working professionals through facial expression recognition. As a society, we are all currently discussing how a person suffering from an emotional issue can find ways out of a specific circumstance, and how we as a society can support such people in these situations. Our endeavor is to work on a way to identify people who are going through a difficult phase in their lives.
Early detection of diseases and pests is important for better yield and quality of crops. Through the reduction in quality of the agricultural product, plant disease can cause huge economic losses to individual farmers. In a country like India, where the majority of the population is involved in agriculture, it is very important to find diseases at an early stage, and faster, more precise prediction of plant disease could help reduce the losses. Significant advances in deep learning have created the opportunity to improve the performance and accuracy of object detection and recognition systems. This paper focuses on finding plant diseases and reducing the resulting economic losses. We propose a deep learning based approach for image recognition and examine three main neural network architectures: Faster Region-based Convolutional Neural Network (Faster R-CNN), Region-based Fully Convolutional Network (R-FCN), and Single Shot MultiBox Detector (SSD). The proposed system can detect different types of disease efficiently and can deal with complex scenarios. Validation results show an accuracy of 94.6%, which demonstrates the feasibility of convolutional neural networks and points the way toward AI-based deep learning solutions to this complex problem.
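A hedged sketch of Faster R-CNN inference with torchvision follows. The COCO pre-trained weights stand in for the paper's plant-disease model, which would instead be fine-tuned on leaf images with disease-class labels; the image path and confidence threshold are assumptions.

```python
# Sketch: Faster R-CNN inference with torchvision.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

img = to_tensor(Image.open("leaf.jpg").convert("RGB"))   # illustrative path
with torch.no_grad():
    out = model([img])[0]                # boxes, labels, scores for one image
for box, score, label in zip(out["boxes"], out["scores"], out["labels"]):
    if score > 0.8:                      # assumed confidence threshold
        print(label.item(), round(score.item(), 3), box.tolist())
```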
A traffic sign recognition system (TSRS) is a significant part of an intelligent transportation system (ITS): being able to identify traffic signs accurately and effectively can improve driving safety. This work mainly addresses the detection and classification of circular signs. First, an image is preprocessed to highlight important information; second, the Hough transform is used to detect and locate candidate regions; finally, the detected road traffic signs are classified using deep learning. The paper proposes a traffic sign detection and identification method based on image processing, combined with a convolutional neural network (CNN) to sort the traffic signs. Owing to its high recognition rate, the CNN can be used to realize various computer vision tasks, and TensorFlow is used to implement it. On the German datasets, the circular signs are identified with the best accuracy.
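The sketch below shows the circular-sign localization step with OpenCV's Hough transform; the located regions would then be cropped and passed to the CNN classifier. The parameter values are typical assumptions, not the paper's settings.

```python
# Sketch: locate circular sign candidates with the Hough circle transform.
import cv2
import numpy as np

img = cv2.imread("road_scene.jpg")                 # illustrative path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)                     # pre-processing step

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=40,
                           param1=100, param2=40, minRadius=10, maxRadius=80)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        crop = img[max(y - r, 0):y + r, max(x - r, 0):x + r]
        # `crop` would be resized and fed to the CNN for classification.
        cv2.circle(img, (x, y), r, (0, 255, 0), 2)
cv2.imwrite("detected_signs.png", img)
```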
The increasing crime rate in crowded events and isolated areas has heightened the importance of security across all domains, and computer vision has significant applications in addressing a range of such problems through the identification and surveillance of abnormalities. The growing need to protect safety, security, and personal belongings has led to significant demand for video surveillance systems capable of recognizing and interpreting scenes and detecting unusual events; such systems play a crucial role in intelligent monitoring. This research paper applies convolutional neural network (CNN) based SSD and Faster R-CNN algorithms to the automatic detection of guns and weapons. The suggested approach uses two distinct categories of datasets, one of which contains pre-labeled images.
For surveillance purposes, researchers have used many methods, but computer vision based human activity recognition (HAR) technologies and systems have received the most interest because they automatically distinguish human behaviour and movements in video data using the details recorded by cameras. However, extracting accurate and timely information about human activities and behaviours from video is one of the most important and difficult tasks in a pervasive computing environment. Because HAR systems have many applications, such as in the medical field, security, visual monitoring, video retrieval, entertainment, and abnormal behaviour detection, system accuracy is one of the most important factors for researchers. This review article presents a brief survey of existing video or vision-based HAR systems, examining their challenges and applications in three respects: recognition of activities, activity analysis, and decisions from visual content representation. In many applications, recognition time and accuracy are the most important factors, and both suffer as the use of simple or low-quality cameras in automated systems increases. To obtain better accuracy and fast responses, demanding and computationally intelligent classification techniques such as deep learning and machine learning are therefore a better option for researchers. In this survey, we review computationally intelligent classification research on HAR from 2010 to 2020 in order to analyse the benefits and drawbacks of the systems, the challenges faced, and the applications, along with future directions for HAR. We also present some open problems and ideas that should be addressed in future research on HAR systems using machine learning and deep learning, given their strong relevance.
Writing in the air has been one of the most fascinating and challenging research areas in image processing and pattern recognition in recent years. It contributes immensely to the advancement of automation and can improve the interface between man and machine in numerous applications. Several research works have focused on new techniques and methods that reduce the processing time while providing higher recognition accuracy. Object tracking is considered an important task within the field of computer vision; the advent of faster computers, the availability of inexpensive, good-quality video cameras, and the demand for automated video analysis have made object tracking techniques popular. In general, video analysis has three major steps: detecting the object, tracking its movement from frame to frame, and analysing the object's behaviour. Object tracking involves four different issues: selection of a suitable object representation, feature selection for tracking, object detection, and object tracking itself. In the real world, object tracking algorithms are a primary component of applications such as automatic surveillance, video indexing, and vehicle navigation. This project takes advantage of this gap and focuses on developing a motion-to-text converter that could serve as software for intelligent wearable devices for writing in the air. The project acts as a recorder of occasional gestures, using computer vision to trace the path of the finger. The generated text can be used for various purposes, such as sending messages or e-mails, and can be a powerful means of communication for the deaf. It is an effective communication method that reduces mobile and laptop usage by eliminating the need to write.
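A minimal air-writing sketch follows: it tracks a coloured marker (a stand-in for the fingertip) by HSV thresholding and joins successive centroids into a stroke; the traced trail would then go to a handwriting recognizer to produce text. The HSV range is an assumption for a blue marker cap.

```python
# Sketch: trace a coloured marker across webcam frames to draw "air ink".
import cv2
from collections import deque

points = deque(maxlen=1024)            # traced path of the fingertip/marker
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (100, 120, 70), (130, 255, 255))  # assumed blue range
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    if cnts:
        c = max(cnts, key=cv2.contourArea)
        (x, y), r = cv2.minEnclosingCircle(c)
        if r > 8:                      # ignore noise; assumed minimum radius
            points.appendleft((int(x), int(y)))
    for p, q in zip(points, list(points)[1:]):
        cv2.line(frame, p, q, (0, 0, 255), 2)      # drawn "ink" trail
    cv2.imshow("air writing", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
```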
Visually impaired people are often unaware of the dangers they face, and they may encounter many challenges while performing daily activities even in familiar environments. Vision is an essential human sense and plays the most important role in human perception of the surrounding environment. Hence, a variety of computer vision products and services are being used in the development of new electronic aids for the blind. In this paper we design a system to provide navigation to such users: it informs the person about nearby objects as well as the distance to each object, which the algorithm itself calculates, and an audio jack is provided to announce the objects to the user. We use the SSD algorithm for object detection and calculate the distance to the object using the monodepth algorithm.
This study, set in the same COVID-19 context described above, proposes automation with a deep learning framework for monitoring social distancing from surveillance video footage and for face mask detection in public and crowded places, treated as a mandatory rule set under pandemic terms, using computer vision. The proposed framework is based on the YOLO object detection model, which separates the background from human beings with bounding boxes and assigned identifications; within the same framework, a trained module checks for any unmasked individual. The automation yields useful data and understanding for evaluating the current state of the pandemic, and this data helps analyse the individuals who do not follow health protocol norms.
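The sketch below shows the distancing check given person boxes from a YOLO detector: pairwise distances between box centroids against a pixel threshold. The threshold and the box list are illustrative; real deployments calibrate pixels to metres per camera.

```python
# Sketch: flag person IDs whose bounding-box centroids are too close.
import math

person_boxes = [(50, 60, 120, 260), (140, 70, 210, 270)]  # placeholder detections

def centroid(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

MIN_PIXELS = 90          # assumed calibrated distance threshold
violations = set()
for i in range(len(person_boxes)):
    for j in range(i + 1, len(person_boxes)):
        if math.dist(centroid(person_boxes[i]),
                     centroid(person_boxes[j])) < MIN_PIXELS:
            violations.update((i, j))
print("IDs too close:", sorted(violations))
```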
One of the major causes of car accidents is the drowsiness a driver acquires while driving. In this project, we aim to develop a real-time driver drowsiness detection system that detects the driver's fatigue status, such as dozing, eyelid flickering, and the duration of eye closure, without having to equip the driver's body with devices. The objective is to build a drowsiness detection system that detects that a person's eyes have been closed for a few seconds and alerts the driver when drowsiness is detected. Apart from the CNN, computer vision also plays a major role in detecting the driver's drowsiness pattern, and a cloud architecture has proved beneficial for capturing and analyzing real-time video streams.
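A hedged sketch of the usual eye-closure test behind such systems follows: the eye aspect ratio (EAR) over facial landmarks, with an alarm after the eyes stay closed for a run of frames. The 0.25 threshold and 20-frame window are assumptions, and `landmark_stream()` is a hypothetical per-frame landmark source (e.g. from dlib's 68-point model).

```python
# Sketch: eye aspect ratio (EAR) drowsiness check over landmark frames.
from scipy.spatial import distance as dist

def eye_aspect_ratio(eye):
    # eye: six (x, y) landmark points around one eye.
    a = dist.euclidean(eye[1], eye[5])
    b = dist.euclidean(eye[2], eye[4])
    c = dist.euclidean(eye[0], eye[3])
    return (a + b) / (2.0 * c)

EAR_THRESH, CLOSED_FRAMES = 0.25, 20      # assumed threshold and window
closed = 0
for eye_landmarks in landmark_stream():   # hypothetical per-frame landmarks
    if eye_aspect_ratio(eye_landmarks) < EAR_THRESH:
        closed += 1
        if closed >= CLOSED_FRAMES:
            print("DROWSINESS ALERT")     # sound the alarm for the driver
    else:
        closed = 0
```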
The drone is one of the latest technologies and is growing into multiple applications; one critical application is the fire-fighting drone, for example one carrying a water hose for firefighting. One of the main challenges of drone technology is the non-linear dynamic movement caused by the variety of fire conditions. One solution is to use a non-linear controller such as reinforcement learning. In this paper, reinforcement learning is applied as the key control system to improve on the conventional approach: the agent (the drone) interacts with the environment without needing a conventional controller for the flying process. The paper introduces an optimization method for the hyperparameters in order to achieve a better reward, concentrating only on the learning rate (alpha) and the discount factor for potential reward (gamma). With this optimization, the best performance and response were obtained using alpha = 0.1 and gamma = 0.8, which produced a reward of 6100 and took 49 seconds in the learning process.
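A minimal tabular Q-learning sketch below shows where the paper's optimized hyperparameters enter: alpha = 0.1 and gamma = 0.8 in the standard update rule. The toy environment and the epsilon value are illustrative assumptions, not the paper's drone simulator.

```python
# Sketch: tabular Q-learning with the paper's alpha and gamma values.
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.8, 0.1     # alpha/gamma from the paper; epsilon assumed
n_states, n_actions = 10, 4
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(state, action):
    """Toy environment: random next state, reward 1 at the goal state."""
    nxt = random.randrange(n_states)
    return nxt, 1.0 if nxt == n_states - 1 else 0.0

state = 0
for _ in range(10000):
    if random.random() < EPSILON:
        action = random.randrange(n_actions)                       # explore
    else:
        action = max(range(n_actions), key=lambda a: Q[state][a])  # exploit
    nxt, reward = step(state, action)
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][action])
    state = nxt
```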
Systematic and exact detection of an object is a central problem in computer vision, and with the unfolding of recent deep learning techniques, the precision of object detection has increased greatly, igniting wide interest in this area. The main aim here is to integrate a state-of-the-art deep learning method for pedestrian detection in real time with improved accuracy. One of the crucial problems with classical computer vision techniques in this setting is that they tend to slow the process down while delivering trivial performance. In this work, an improved SSD transfer learning based deep learning technique is used for object detection. It is also shown that this approach can solve the object detection problem in a sustained manner, with the ability to further separate occluded objects, and that it enhances detection accuracy. The network is trained on a challenging dataset, and the output obtained is fast and precise, which is helpful for applications that require object detection.
The rapid development of artificial intelligence has revolutionized the area of autonomous vehicles by incorporating complex models and algorithms. Self-driving cars remain one of the biggest inventions in computer science and robotic intelligence, and highly robust algorithms that facilitate the functioning of these vehicles will reduce many problems associated with driving, such as drunk driving. In this paper our aim is to build a deep learning model that can drive the car autonomously, adapts well to real-time tracks, and does not require any manual feature extraction. This research work proposes a computer vision model that learns from video data, involving image processing, image augmentation, behavioural cloning, and a convolutional neural network model. The neural network architecture is used to detect the path in a video segment, road linings, and the locations of obstacles, and behavioural cloning is used so that the model learns from human actions in the video.
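A hedged sketch of a behavioural-cloning model in the spirit described here follows, patterned on the well-known NVIDIA PilotNet layout: camera frames in, steering angle out. The layer sizes are assumptions, not the paper's exact architecture.

```python
# Sketch: PilotNet-style behavioural-cloning model in Keras.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(66, 200, 3)),                 # cropped, resized frame
    layers.Lambda(lambda x: x / 127.5 - 1.0),         # normalize pixels to [-1, 1]
    layers.Conv2D(24, 5, strides=2, activation="elu"),
    layers.Conv2D(36, 5, strides=2, activation="elu"),
    layers.Conv2D(48, 5, strides=2, activation="elu"),
    layers.Conv2D(64, 3, activation="elu"),
    layers.Conv2D(64, 3, activation="elu"),
    layers.Flatten(),
    layers.Dense(100, activation="elu"),
    layers.Dense(50, activation="elu"),
    layers.Dense(10, activation="elu"),
    layers.Dense(1),                                  # predicted steering angle
])
model.compile(optimizer="adam", loss="mse")           # regression on human steering
```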
Action recognition in videos, especially for violence detection, is now a hot topic in computer vision. Interest in this task is driven by the multiplication of videos from surveillance cameras and live television content, producing complex 2D + t data. State-of-the-art methods rely on end-to-end learning with 3D neural network approaches, which must be trained on a large amount of data to obtain discriminating features. To face these limitations, this article presents a method for classifying videos for violence recognition using a classical 2D convolutional neural network (CNN). The strategy is two-fold: several 2D spatio-temporal representations are first built from an input video, and these new representations are then fed to the CNN for the training and testing process. The classification decision for a video is made by aggregating the individual decisions from its different 2D spatio-temporal representations. An experimental study on public datasets containing violent videos highlights the interest of the presented method.