Cybersecurity interest?



1. At Amazon people work in corporate office buildings, data centers, and remotely. Imagine the situation where someone's badge was stolen and is being used by someone else to enter a building. How would you detect such an event and what data sets would you use to do that analysis?

take the time you lost your badge as the cutoff, then look at the next activities that occur with that badge
- data: collected every time you enter and leave the building
- we would track down the badge number, see if there's been any activity
- are they entering a specific building?
- cadence of it, where are they going
- working remotely under VPN (assumption)
- IP address, working remotely
- Rohit's badge working from office
- when was the person with the badge in the building?
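
A minimal sketch of the badge-history check above, assuming badge-reader events arrive as (badge id, building, timestamp) rows. The field layout, the sample events, and the 07:00-19:00 "usual hours" window are all made up for illustration; a real baseline would be learned per person.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical badge-reader events: (badge_id, building, ISO timestamp).
events = [
    ("B123", "HQ-1", "2024-03-04T08:55:00"),
    ("B123", "HQ-1", "2024-03-05T09:02:00"),
    ("B123", "HQ-1", "2024-03-06T08:47:00"),
    ("B123", "DC-7", "2024-03-07T02:14:00"),  # new building, odd hour
]

def flag_unusual_swipes(events, work_hours=(7, 19)):
    """Flag swipes at a building the badge has not used before, or outside usual hours."""
    seen = defaultdict(set)  # badge_id -> buildings already visited
    flags = []
    # ISO timestamps in one format sort correctly as strings.
    for badge, building, ts in sorted(events, key=lambda e: e[2]):
        hour = datetime.fromisoformat(ts).hour
        reasons = []
        # Only call it a "first visit" once the badge has some history.
        if seen[badge] and building not in seen[badge]:
            reasons.append("first visit to building")
        if not (work_hours[0] <= hour < work_hours[1]):
            reasons.append("outside usual hours")
        if reasons:
            flags.append((badge, building, ts, reasons))
        seen[badge].add(building)
    return flags

flags = flag_unusual_swipes(events)
for flag in flags:
    print(flag)
```

These flags are only candidate features (cadence, location); they could feed the stolen / not stolen classifier rather than act as a verdict on their own.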

- badge is stolen (1) or not stolen (0) - binary classification
- probability
-- random forest model - ensembling grossly overfit trees
- correlated results
- cadence of work schedule, keeping track of location, other suspicious activities in the area, dept that you are in (maybe it's something going on in the department)


1. (Video facial recognition might be an answer provided here. If so, remove video from the equation.) If there wasn't a facial recognition capability what can badge reader logs tell you? (Do we have a history of building activity for everyone? Was the person with the stolen badge entering a new building for the first time?)
2. Suppose you had both building access logs and network logs that recorded network and logon activity for all individuals on the network. How could you use these to detect the imposter?

network that you aren't expecting
network logs: activity occurring at an odd time, or at high frequency, something that is not normal, red flags
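
One way the two log sources could be combined, as a rough sketch: if the same person has a remote VPN login and an office badge swipe close together in time, one of the two is probably not them. The row layouts, user names, IP addresses, and the one-hour window are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical rows: (user, building, timestamp) and (user, source_ip, timestamp).
badge_swipes = [("rohit", "HQ-1", "2024-03-07T10:05:00")]
vpn_logins = [
    ("rohit", "203.0.113.8", "2024-03-07T09:50:00"),
    ("alex", "198.51.100.4", "2024-03-07T10:00:00"),
]

def conflicting_presence(badge_swipes, vpn_logins, window=timedelta(hours=1)):
    """Return (user, building, ip) where a remote login and an office swipe fall within `window`."""
    conflicts = []
    for user, building, swipe_ts in badge_swipes:
        swipe_time = datetime.fromisoformat(swipe_ts)
        for vpn_user, ip, login_ts in vpn_logins:
            if vpn_user != user:
                continue
            gap = abs(datetime.fromisoformat(login_ts) - swipe_time)
            if gap <= window:
                conflicts.append((user, building, ip))
    return conflicts

conflicts = conflicting_presence(badge_swipes, vpn_logins)
print(conflicts)
```

The nested loop is fine for a sketch; on real volumes this would be a join on user with a time-window condition.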

3. If you could geolocate where people are logging in from, or if you had travel logs for individuals, how might those data sets be used to improve your analytic approach?

PTO - Paris - but activity is being made locally with the badge. Let Amazon know that something isn't right
on the system, it's being logged that something is being badged.
- hope is to pick up on fraudulent logins
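
The PTO-in-Paris case can be framed as an "impossible travel" check: two events for the same person whose distance and time gap imply a speed no one could travel. A minimal sketch, assuming each event carries latitude, longitude, and a timestamp; the coordinates, timestamps, and the 900 km/h threshold (roughly airliner speed) are illustrative assumptions.

```python
import math
from datetime import datetime

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_impossible_travel(event_a, event_b, max_kmh=900.0):
    """Each event is (lat, lon, ISO timestamp). True if the implied speed exceeds max_kmh."""
    (lat1, lon1, t1), (lat2, lon2, t2) = event_a, event_b
    seconds = abs((datetime.fromisoformat(t2) - datetime.fromisoformat(t1)).total_seconds())
    dist = haversine_km(lat1, lon1, lat2, lon2)
    if seconds == 0:
        return dist > 0
    return dist / (seconds / 3600) > max_kmh

# Hypothetical: remote login geolocated to Paris, badge swipe in Seattle two hours later.
paris_login = (48.8566, 2.3522, "2024-03-07T09:00:00")
seattle_swipe = (47.6062, -122.3321, "2024-03-07T11:00:00")
print(is_impossible_travel(paris_login, seattle_swipe))
```

Travel logs would play the same role as geolocation here: they give an expected location to compare the badge activity against.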



1. In cybersecurity it’s sometimes useful to analyze process lineage as part of a cybersecurity investigation. In some cases one might find a typical process (e.g. explorer) launching some malicious program. Imagine we had process data with parent/child relationships stored as events in our log repository. How might you develop a data analytic technique to detect the scenario when a parent process launches a malicious child process?

metric for determining how corrupt your computer can be
- if there is a numeric metric that can keep track of how the computer has worsened
- or a virus that has been introduced
- number of popups (spam, inappropriate information)

1. They might ask about putting the data in a graph database — steer them back to a row/column structure
2. How do you deal with process chains that are more than one level deep (e.g. A → B → C → D) in this construct?

- good to create a flag for these levels
- label 1 through 5
- root cause of performance issues
- level 1 or level 2 or level 5
- controlled version of computer, hypothesis test for each category. significant difference in performance.
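
A sketch of how the level flag could be derived while staying in a row/column structure: each event row has a pid and a parent pid, and walking the parent pointers gives both the depth (level) and the full chain as extra columns. The column names and sample processes are assumptions, and the sketch ignores PID reuse, which real logs would need to handle (for example by keying on pid plus start time).

```python
# Hypothetical process-creation events as rows: pid, parent pid, process name.
rows = [
    {"pid": 1, "ppid": 0, "name": "explorer.exe"},
    {"pid": 2, "ppid": 1, "name": "winword.exe"},
    {"pid": 3, "ppid": 2, "name": "cmd.exe"},
    {"pid": 4, "ppid": 3, "name": "powershell.exe"},
]

def lineage(rows):
    """Return {pid: {"level": depth, "chain": "root > ... > process"}} by walking parent pids."""
    by_pid = {r["pid"]: r for r in rows}
    result = {}
    for r in rows:
        chain = [r["name"]]
        seen = {r["pid"]}  # guards against cycles in bad data
        parent = by_pid.get(r["ppid"])
        while parent is not None and parent["pid"] not in seen:
            chain.append(parent["name"])
            seen.add(parent["pid"])
            parent = by_pid.get(parent["ppid"])
        chain.reverse()  # root first
        result[r["pid"]] = {"level": len(chain), "chain": " > ".join(chain)}
    return result

chains = lineage(rows)
for pid, info in chains.items():
    print(pid, info["level"], info["chain"])
```

With the chain flattened into a column, rare chains (say, an office application ending in a shell) can be counted and ranked like any other categorical value.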

3. What kind of false positives would you see with your detection?

- application is totally safe (?) |
python package - algo said it is dangerous (?) | the team uses an outdated package: slower, less efficient, not upgraded. the latest version will have issues with the program.
- could be improving their workflow.


4. What is difficult about this problem and does your approach scale to large datasets?

hadoop - distribute data across multiple servers
perform parallel processing, scale
fault tolerance

biggest hurdles:
- beginning stages of problem
- what variables would I be focusing on
- how would I impact target variable
- those might be initial hiccups
- stage of understanding suspicious activities

- unable to come up with implications or considerations



THINK BIG
Tell me about a time when you thought differently to improve a process that was working. What assumptions did you have to question? How did you evaluate if the change improved the process? Knowing what you know now, would you do anything differently?

- building search engine for private company
- tasked with extracting text from unstructured documents
- WE
- PDFs
- multiple ways to go about this
- disagreement, team not sure how to go about this
- may come with maintenance issues in the future,
higher level of effort up front, later


- 1st way: use client-specific regex patterns - easy to implement, documents were in consistent format
- 2nd way: NLP - sentence-level embeddings, used to classify text
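
For reference, the 1st way might look roughly like this. The field names, the patterns, and the sample text are invented for illustration; the real patterns were client specific and are not in these notes.

```python
import re

# Hypothetical client-specific patterns; they only work because the format is consistent.
PATTERNS = {
    "contract_id": re.compile(r"Contract\s+No\.?\s*:\s*([A-Z]{2}-\d{4})"),
    "effective_date": re.compile(r"Effective\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
}

def extract_fields(text):
    """Return one value per field, or None when the pattern does not match."""
    out = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        out[field] = match.group(1) if match else None
    return out

sample = "Contract No: AB-1234\nEffective Date: 2021-06-30\nParties: ..."
fields = extract_fields(sample)
print(fields)
```

The trade-off matches the disagreement above: cheap to write now, but every format change means another pattern to maintain.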

our team decided to use the regex approach. client prioritized search engine for looking up documents on the structured side of the data
unstructured docs made up less than 5%

regex would still bring value to client
precision of 80%
satisfied with extraction method
performing search functionality for structured side



Tell me about a time when you drove adoption for your vision/ideas. How did you know your vision/idea was adopted by others? How did you drive adoption for your vision/ideas? How did you track adoption? Would you do anything differently?

govt client
vaccine management platform - quickly developed, largely increased number of covid vaccinations
preliminary design of platform did not account for joined and harmonized data
providers would log inventory info, track vaccine status, etc
one data entry - automatically receiving direct integration
inventory errors

no easy solution for inventory errors
fuzzy matching
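
A small sketch of what fuzzy matching could look like for inventory entries, using Python's standard difflib. The canonical product list, the sample entries, and the 0.6 cutoff are assumptions for illustration, not the platform's actual data or method.

```python
import difflib

# Hypothetical canonical product names on the platform side.
CANONICAL = ["Pfizer-BioNTech", "Moderna", "Janssen"]

def match_product(entry, cutoff=0.6):
    """Map a free-text inventory entry to the closest canonical name, or None."""
    lowered = {name.lower(): name for name in CANONICAL}
    hits = difflib.get_close_matches(
        entry.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None

for entry in ["moderna ", "Pfizer Biontech", "Modrena", "xyz"]:
    print(repr(entry), "->", match_product(entry))
```

Anything that returns None would still need a manual review step, so fuzzy matching reduces the inventory errors rather than removing them.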





INVENT AND SIMPLIFY
Tell me about a time when you had a challenging problem or situation that the usual approach wouldn't address. How did you select an alternative approach? What alternative approach(es) did you consider? What was the end result? What was the impact?

NLP typical way - regex - schedule 1:1s to understand their processes and document history
changes in doc format rarely occurred over span of decade

covid project - ad hoc analysis. pharmacies, hospitals, big clinics. each provider has their own way of recording data, stored on platform.
what certain terms mean, what they mean for the platform.

one size fits all, really have to assess how they are storing their data.


Give me an example of a complex problem you solved with a simple solution. What made the problem complex? How do you know your solution addressed the problem?


- covid project - data to test, before live environment. gave us data in excel. headers are correct, are data values correct?
imported to do analysis. inventory errors.
getting duplicate results.
went back to client.
hey, you have duplicate vaccination events. why is this the case?
disagreement there.
found out that excel had been doing this weird thing: it transformed the unique identifier, dropped leading zeroes, converted to string.
not an obvious find right away. there were instances with unique values of 0.
after back and forth, and doing analysis, excel was doing this.
very easy solution - confirm all fields were strings, import them in the correct way.
receive data as csv, jupyter notebook.
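
A tiny reproduction of the leading-zero problem, with made-up IDs. Python's csv module reads every field as a string, so the IDs stay distinct; converting them to numbers is what collapses them into duplicates. In pandas, passing `dtype=str` to `read_csv` is the usual way to get the same behaviour.

```python
import csv
import io

# Hypothetical export: three different event IDs that differ only by leading zeros.
raw = "event_id,dose\n00123,1\n0123,1\n123,2\n"

rows = list(csv.DictReader(io.StringIO(raw)))

ids_as_text = [r["event_id"] for r in rows]      # kept as strings
ids_as_int = [int(r["event_id"]) for r in rows]  # what a numeric import does

unique_text = len(set(ids_as_text))
unique_int = len(set(ids_as_int))
print("unique as text:", unique_text)
print("unique as numbers:", unique_int)
```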

     
 